FIELD OF THE INVENTION

The present invention is directed to novel human and mouse DNA
sequences encoding a protein which, when present in mutated form,
results in the occurrence of Best's macular dystrophy.

BACKGROUND OF THE INVENTION 

20 group of disea^ttueI Phy , iS 8 T M to * Aeneous 
large number "* h """»■ ^ l ™ » * 

- a Progressive loss of W ^ CharaCteristic oscular dystrophy 

the Pigmented ep 1, rSLT° n T"" ^ d<! ™°" ° f 
forms of macular dvT^ <mderlyin e »*»al macula. In many 

25 blindness ^"St ^ *— res ^ * 

abated ma^"^ ^ ~ ^ 

macular dystrophy (V^Tulh f*f ° > ™>™ 

dominant neovLcX^l * yndr ° me ^ 1B ' I 
^dative vitreoltin" !f , ^ "^""^"Pathy, faxUU 

30 as here^taTrcuT^LT ^ Ms » 

dystrophy ™r F or ' y " B " ft " teUifonn ^ar 

Sullivan & Daiger, 1996, Mol. Med. Today 2:380-386

dominant j£ ^TTT ^ " «— - 

symptoms include a^aTf! ClMUhO0< ' 10 «*« 40 CU -«" 

at early stages, an abnormal accumulation of the

in the retinal pigmented epithelium (RPE)

underlying the macula. This gives rise to a characteristic "egg yolk"
appearance of the RPE and gradual loss of visual acuity. With
increasing age, the RPE becomes more and more disorganized as the

place. These changes are accompanied by further loss of vision.

similar t„ «. , r ° 8iCaI featUre8 Seen to BMD <™ in -any ways 
similar to the features seen in age-related macular dystrophy, the

particular, the discovery of the underlying genetic cause of BMD will
shed light on age-related macular dystrophy as well.

for BMD reside^* 8 ^ ^ S ™ ™P<>nsMe 

BMD resides in the pericentric region of chromosome 11, at 11q13,

near the markers DllS9*;fi vpPDir. , TT ' AA <1"> 

IX&a5b ' FCERl B»andUGB(Forsmanetal 1QQ9 
Clin. Genet. 42:156-159; Hou et al., 1996, Human Heredity 46:211-220).

Recently, the gene responsible for BMD was localized to a ~1.7 Mb PAC
contig lying mostly between the markers D11S1765 and UGB (Cooper et
al., 1997, Genomics 41:185-192).

the BMD gene to a ^^^27^" 

microsatellite markers D11S4076 and UGB (Graff et al., 1997, Hum.
Genet. 101: 263-279).

One difficulty in diagnosing BMD is that carriers of the
diseased gene for BMD may be asymptomatic in terms of visual acuity
and morphological changes of the RPE observable in a routine

Potentia, between the ^^^72^^ 
asymptomatic BMD patients from norma, individual £T£ the 
re<ImreS •""tad. -P»sive equipment, is difficult to 
alterative m ethod J ll ^ '* W0UW be *• have an

5 not require the present ofTe^ *- 
For example, a diagnostic test based on a blood sample from a
patient suspected of being an asymptomatic carrier of BMD would be
ideal.

SUMMARY OF THE INVENTION

mouse DNA ^^T^ * — human and 

is respo^b7 e £ 'Cf which, when 

invention mel„A.. „_ " °1 ^trophy. The 

present 

5 the CG1CE protein TneT,, " WeU aS ^ °»t «*>>** 

*- fh,m otter ZJ^ZZTT ^ ^ " «*— ^ 

^NO,, The ^^SSS: ~ *~ i0 

substantially free from n+k , . ng CLrlCE Protein is 

, ~ sjwn -jsssst - — 

provided is CG1CE nrl! , " SE< 3 ID -NO ,28. Also 

human CQ^^^JT^ * «- ™A sequences. The 
the amino acid sU^s^TsE* m ^ ^ 
mouse CG1CE protein is suh^? If ^ °" 3 SE « ID NO,5. The 
*e amino acid ~ 2"^^ ~ - 
expressing CGlPP t,™+«- • Methods of 

genes. cect earners of mutant CGlCE 

BRIEF DESCRIPTION OF THE DRAWINGS

CGlCE (SEQ^oTu i rIr. th 7 en0 f C ^ ° f h — 

— . H- start ATO^t^ a "^T" * ^ "~ 

11 are shown in bo.H ~ * " na 1,16 st °P TA * codon in exon 

AATAAA in m u ii^^^J*^**- — 

i ne alternatively spliced part of 
portion of exon n beginning ,t ^ 6 ° f conTeni ™ce The

5 undated regi „n ^ g " P ° S,t,0n 15 " 788 »n— . nts the 3' 

of the i e^r C^TsTr »°' y <«°» 
untranslated resion of th« <• ^ ? ' re P resenti »S the 3'- 

Nat. Acad. Sci. 83: 7226-7230)- the " °'" I986 ' Proc ' 

10 of the MB., ,^ Blmm ";' h c e ™ g6ne was >«« — w. to be a part 
deterged by ^-a-J^^"^ *» BMD as 

5 P ° S,ti0n ^ "» ™A atop codon is at p„^ on ^ 0 " * 

■on 8 fo™ of huZ CG h °CE D tt C °T^ amia0 rf 

the human CGlCE^^ T ! Q ro N ° :3) ™« 1m * f »™ °f 

of CG1CE cDNA " PrMUCed by Nation of the short form 

CG1CE cDNA is produced len ^ l^ ° f th ° huma » 

utilised in intron 7 The ATG 7T ^ temabve s P hce "Jonor site is 
codon is at position mo ^ ta * "« W « the TGA stop 

short fo ra JE^^m""**' ^ ° fthe 

ibnn of the hun^S^f. ~, <8 *> JI »«>-*>- This short 
•n of CG1CE cW " Pr0dUCed by Nation of the long 

fragments i£Z£!£Z "Vf " of PCR 

three indiviH.,., XZT ™ 4 "* «*— * ^onic 

regions from 

affected with BmXl ^ ° f wh ° m are 

(Wo^ous arZd^ X • "** ^ 

(heteroozygous """^ -** 8M 

fnonna, control, unaffected fJ^^Z ""^ " S1 " 3 
(affected with BMD) anti-sen^ 7 "notation; patient Sl-5 

Mt " station; patient SU (affected with 
o B ^^T e H ° rie : tati0n; " Sl " 3 ( ~ -ti-sense
orientation. Reading from left to right, the mutation shows up at

The mutation in family S1 changes tryptophan to cysteine

Figure 7 shows a multiple sequence alignment of human
CG1CE with the sequences of related proteins from C. elegans.
Related proteins from C. elegans were identified by BLASTP
analysis of the non-redundant GenBank database. This figure shows that
two amino acids mutated in two different Swedish families with BMD

pr^frlT^^ ^of^aT 
proteins from C. elegans contain a tryptophan at the position of the
mutation in family S1, as does the wild-type CG1CE gene. Only one C.
n^T n<>t ^ 3 at *• P^on of the" 

c7ant7lJ n T (aC — nUmb6r P345?7) > is 

sMaf JT 1S f 0f ^ Ctl0nal P^lanine (phenylalanine is highly 

similar to tryptophan in that it also is a hydrophobic aromatic amino
acid). Again, all 16 related proteins from C. elegans contain tyrosine or

phenylalanine at this position (phenylalanine is highly

similar to tyrosine in that it also is an aromatic amino acid).

shows the complete sequence of mouse CG1CE
cDNA (SEQ.ID.NO.:28) and mouse CG1CE protein (SEQ.ID.NO.:29).

Figure 9A-B shows an alignment of the amino acid 
sequences of the long form of human CG1CE protein (SEQ.ID.NO.:3)
and mouse CG1CE protein (SEQ.ID.NO.:29).

Figure 10A-C shows the results of in situ hybridization
localized to the retinal pigmented epithelium cells (RPE). Figure 10A

lC« o^e ° G1CE ^ ~ to the ~ <*" 

The antisense probe strongly hybridized to the RPE cells
and not to the cells of the other layers of the retina. Figure 10B shows
the results using a sense CG1CE probe as a control. The sense probe
does not hybridize to CG1CE mRNA and does not label the RPE cells.
Figure 10C is a higher magnification of the RPE cells from Figure 10A.
Human CG1CE mRNA shows a similar distribution, being confined to 
the RPE cells of the human retina. 

DETAILED DESCRIPTION OF THE INVENTION 
For the purposes of this invention: 

"Substantially free from other proteins" means at least 90% 
preferably 95%, more preferably 99%, and even more preferably 99.9%,
free of other proteins. Thus, a CG1CE protein preparation that is
substantially free from other proteins will contain, as a percent of its
total protein, no more than 10%, preferably no more than 5%, more
preferably no more than 1%, and even more preferably no more than
0.1%, of non-CG1CE proteins. Whether a given CG1CE protein
preparation is substantially free from other proteins can be determined
by such conventional techniques of assessing protein purity as, e.g.,
sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) 
combined with appropriate detection methods, e.g., silver staining or 
immunoblotting. 

"Substantially free from other nucleic acids" means at least 
90%, preferably 95%, more preferably 99%, and even more preferably 
99.9%, free of other nucleic acids. Thus, a CG1CE DNA preparation that
is substantially free from other nucleic acids will contain, as a percent of
its total nucleic acid, no more than 10%, preferably no more than 5%,
more preferably no more than 1%, and even more preferably no more
than 0.1%, of non-CG1CE nucleic acids. Whether a given CG1CE DNA
preparation is substantially free from other nucleic acids can be
determined by such conventional techniques of assessing nucleic acid
purity as, e.g., agarose gel electrophoresis combined with appropriate
staining methods, e.g., ethidium bromide staining, or by sequencing. 
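
The tiered thresholds in the two definitions above can be written as a small lookup. The following is an illustrative sketch only: the function name purity_tier and the idea of passing in a measured contaminant percentage (for example, one estimated from a stained gel) are assumptions made for this example and are not part of the disclosure.

```python
def purity_tier(contaminant_percent: float) -> str:
    """Classify a preparation by its percentage of non-CG1CE material.

    The limits follow the definitions in the text: no more than 10%,
    5%, 1%, or 0.1% of other proteins (or other nucleic acids).
    """
    if not 0.0 <= contaminant_percent <= 100.0:
        raise ValueError("contaminant_percent must be between 0 and 100")
    # Check the strictest tier first so the best matching label is returned.
    tiers = [
        (0.1, "even more preferred (at least 99.9% free)"),
        (1.0, "more preferred (at least 99% free)"),
        (5.0, "preferred (at least 95% free)"),
        (10.0, "substantially free (at least 90% free)"),
    ]
    for limit, label in tiers:
        if contaminant_percent <= limit:
            return label
    return "not substantially free"
```

How the contaminant percentage is measured is left open here; the text names SDS-PAGE with silver staining or immunoblotting for proteins, and agarose gel electrophoresis or sequencing for nucleic acids.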

A "conservative amino acid substitution" refers to the 
replacement of one amino acid residue by another, chemically similar 
amino acid residue. Examples of such conservative substitutions are:
substitution of one hydrophobic residue (isoleucine, leucine, valine, or
methionine) for another; substitution of one polar residue for another
polar residue of the same charge (e.g., arginine for lysine; glutamic acid
for aspartic acid); substitution of one aromatic amino acid (tryptophan,
tyrosine, or phenylalanine) for another.
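
The definition above can be turned into a simple membership test. This is a minimal sketch under stated assumptions: the function name is_conservative is hypothetical, residues are given as one-letter codes, and the same-charge polar groups contain only the pairs the text names as examples (arginine/lysine and glutamic acid/aspartic acid), so the check is narrower than the general definition.

```python
# Residue groups taken from the examples in the definition (one-letter codes).
HYDROPHOBIC = frozenset("ILVM")    # isoleucine, leucine, valine, methionine
AROMATIC = frozenset("WYF")        # tryptophan, tyrosine, phenylalanine
POSITIVE_POLAR = frozenset("RK")   # arginine, lysine
NEGATIVE_POLAR = frozenset("DE")   # aspartic acid, glutamic acid

GROUPS = (HYDROPHOBIC, AROMATIC, POSITIVE_POLAR, NEGATIVE_POLAR)


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues fall in the same group listed above."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # no substitution took place
    return any(original in g and replacement in g for g in GROUPS)
```

Under this sketch, a tryptophan-to-cysteine change is not conservative, while a tyrosine-to-phenylalanine change is.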

The present invention relates to the identification and
cloning of CG1CE, a gene which, when mutated, is responsible for
Best's macular dystrophy. That CG1CE is the Best's macular dystrophy
gene is supported by various observations:

1. CG1CE maps to the genetically defined region of
human chromosome 11q12-q13 that has been shown to contain the Best's

10 Td 4 6 ln7t r ?ene - CG1CE iS Present ° n two PAC «°™> 759^2 

Z I PrGC1Se,y ^ thG m ° St narrowl y ^fined region that 

has been shown to contain CG1CE (Cooper et al., 1997, Genomics 41:185-
192; Stöhr et al., 1997, Genome Res. 8:48-56; Graff et al., 1997, Hum.
Genet. 101: 263-279).

2. CG1CE is expressed predominately in the retina.

3. In patients having Best's macular dystrophy, CG1CE
contains mutations in evolutionarily conserved amino acids.

(FTm tw k 4 ' k Th t CGlCE genomic clones contain another gene 

d3onh r PhySiC3lly aSS ° dated the Best ' s ocular 

dystrophy region (Cooper et al., 1997, Genomics 41:185-192; Stöhr et al.,
1997, Genome Res. 8:48-56; Graff et al., 1997, Hum. Genet. 101:263-279).

lel plal T ^ ° riented the ^tween 

their polyadenylation signals is 132 bp.

The present invention provides DNA encoding CG1CE that
is substantially free from other nucleic acids. The present invention
also provides recombinant DNA molecules encoding CG1CE. The
present invention provides DNA molecules substantially free from other

SEQ.ID.NO.:1. Analysis of SEQ.ID.NO.:1 revealed that this genomic
sequence defines a gene having 11 exons. These exons collectively have

a shorter „ , " Pr ° dUCed - AKh ° Uf!h «* -DNA contains 

a shorter open reading frame of 1,305 bases (due to the presence of a

SSTJ^ r Thus - *• present *» *- 

mol «» 1 » «*°<W two forms of CGlCE protein that are 






WO 99/43695 



PCT/US99/03790 



SEQ.ID.NO.:2 and in Figure 4 as
SEQ.ID.NO.:4.






5 ^UmyL'ZTo^r^ indUdeS 

invention includes DNA m „l T^f °" 4 " A <*ordingl y , the present 
— having a ^ueTZn *- *» °ther nudeic 

and position^ iaS,2^^^^/«WI»-ND. a 
recombinant DNA molecules having a nucleotide sequence comprising
positions 105-1,859 of SEQ.ID.NO.:2 and positions 105-1,409 of SEQ.ID.NO.:4.

> AA318352 £Z5££ S^^tT* <aCC<i8Si0n nUmbere 
are accession numbers AA3071 , o ^ * this cDNA 

*om neuronal cel, ZTT^ZTT^^ AA2 ° 5892 
true mouse ortholog of th e CGlCT !! A 
AA497726 (from mouse tertU) " rePreSent<!d in *• -"use EST 

cgice, in 2:zt Z A dTr r e r ent ; ~ ^ 

DNA sequences to ^ch^t 7 DNA M — ' 

"recombinant DNA ^ ^£ ^"s ' bT * 
can include DNA sequences th„ t T , * 0ther ""l™™*! 

tract or the myc epitope. Th^!v el DN f "* **' " 
invention can be inJL, ? sequences of the present 

ve^s, pi a^rrcr : ^ssv-* *- 

Included in *h artificial chromosomes. 

salmon sperm DNA S J ° 1Uti0n - and 100 ^

5 sperm DNA J£S x7o Z t sL? ^ denatU " d 

is done at 37*C I , " £ ^ °L ^ ^ ^ W " U »« ° f 

is followed by a wash ta 0 ™ n 2X ^ ° 1% SDS ™« 

autoradiography. ' ° S ° S at 5 °° C ftr 45 «*»• before 

.0 would indude^rThybrid! T* C ° nditi ° nS ° f "* 

" carrying ^5!^^^^^ — *~ «- 

Details of the cL pos itir„Tth "* ^ kn ° Wn in *• 

Sambrook, Fritsch, and Maniatis, 1989, Molecular Cloning: A Laboratory
Manual, second edition, Cold Spring Harbor Laboratory
Press. In addition to the

» which maybe '?ZZ"~ *"+ 

two amino J£ toTZ if "* iS "* »"* «" «" 

acid. This allows for the construction of synthetic DNA that encodes the
CG1CE protein where the nucleotide sequence of the synthetic DNA
differs significantly from the nucleotide sequences of SEQ.ID.NOs.:2 or
4, but still encodes the same CG1CE protein as SEQ.ID.NOs.:2 or 4.
Such synthetic DNAs are intended to be within the scope of the present
invention.
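
The degeneracy argument above can be made concrete by counting how many distinct DNA sequences encode a given peptide under the standard genetic code. This is a minimal sketch: the function name synonymous_dna_count and the short peptides used with it are hypothetical illustrations, not sequences taken from SEQ.ID.NOs.:2 or 4.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter codes; stop codons are not counted).
CODONS_PER_RESIDUE = {
    **dict.fromkeys("LSR", 6),
    **dict.fromkeys("AGPTV", 4),
    "I": 3,
    **dict.fromkeys("FYHQNKDEC", 2),
    **dict.fromkeys("MW", 1),
}


def synonymous_dna_count(peptide: str) -> int:
    """Count the distinct coding sequences that translate to `peptide`."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

Because only methionine and tryptophan have a single codon, the count grows multiplicatively with peptide length, which is why a synthetic DNA can differ substantially in sequence while still encoding the same protein.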

Mutated forms of CG1CE DNA are intended to be
within the scope of the present invention. In particular, mutated forms
of SEQ.ID.NOs.:1, 2, and 4 which give rise to Best's macular dystrophy are

within the scope of the present invention. Accordingly, the present
invention includes a DNA molecule having a nucleotide sequence that is
identical to SEQ.ID.NO.:1 except that the nucleotide at position 7,259 of
SEQ.ID.NO.:1 is T, A, or C rather than G, so that the codon at positions
7,257-7,259 encodes either cysteine or is a stop codon rather than
encoding tryptophan. Also included in the present invention is a DNA 
molecule having a nucleotide sequence that is identical to SEQ.ID.NO.:1
except that at least one of the nucleotides at position 7,257 or 7,258 has 
been changed so that the codon at positions 7,257-7,259 does not encode 
tryptophan. 

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that the nucleotide at position 383 is T, A, or C 
rather than G, so that the codon at positions 381-383 encodes either 
cysteine or is a stop codon rather than encoding tryptophan. Also 
included in the present invention is a DNA molecule having a nucleotide 
sequence that is identical to positions 105-1,859 of SEQ.ID.NO.:2 except
that at least one of the nucleotides at position 381 or 382 has been 
changed so that the codon at positions 381-383 does not encode 
tryptophan. 
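
The statement that changing the third nucleotide of the tryptophan codon yields either cysteine or a stop codon follows directly from the standard genetic code, as the sketch below shows. It assumes the standard code and the single tryptophan codon TGG; the helper name point_mutants is hypothetical and introduced only for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}


def point_mutants(codon: str, index: int) -> dict:
    """Map each single-base change at 0-based `index` to its encoded residue.

    A '*' value marks a stop codon.
    """
    mutants = {}
    for base in BASES:
        if base != codon[index]:
            mutant = codon[:index] + base + codon[index + 1:]
            mutants[mutant] = CODON_TABLE[mutant]
    return mutants


# Third position of the tryptophan codon (position 383 of the cDNA).
trp_third_position = point_mutants("TGG", 2)
```

Here trp_third_position contains TGT and TGC (both cysteine) and TGA (stop), matching the "T, A, or C rather than G" alternatives in the text.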

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,409 of 
SEQ.ID.NO.:4 except that the nucleotide at position 383 is T, A, or C 
rather than G, so that the codon at positions 381-383 encodes either 
cysteine or is a stop codon rather than encoding tryptophan. Also 
included in the present invention is a DNA molecule having a nucleotide 
sequence that is identical to positions 105-1,409 of SEQ.ID.NO.:4 except 
that at least one of the nucleotides at position 381 or 382 has been 
changed so that the codon at positions 381-383 does not encode 
tryptophan. 

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 7,233 of SEQ.ID.NO.:1 is C, A, or G rather than T,
so that the codon at positions 7,233-7,235 does not encode tyrosine. Also
included in the present invention is a DNA molecule having a nucleotide
sequence that is identical to SEQ.ID.NO.:1 except that at least one of the
nucleotides at position 7,234 or 7,235 has been changed so that the codon 
at positions 7,233-7,235 does not encode tyrosine. 

The present invention includes a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that the nucleotide at position 357 is C, A, or G
rather than T, so that the codon at positions 357-359 does not encode
tyrosine. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that at least one of the nucleotides at position 358 or
359 has been changed so that the codon at positions 357-359 does not
encode tyrosine.

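The present invention includes a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that the nucleotide at position 357 is C, A, or G
rather than T, so that the codon at positions 357-359 does not encode
tyrosine. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that at least one of the nucleotides at position 358 or
359 has been changed so that the codon at positions 357-359 does not
encode tyrosine.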

The present invention includes a DNA molecule having a
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 3,330 is C rather than A. Also included in the
present invention is a DNA molecule having a nucleotide sequence that
is identical to SEQ.ID.NO.:1 except that the nucleotide at position 3,330 of
SEQ.ID.NO.:1 is G, C, or T rather than A, so that the codon at positions
3,330-3,332 does not encode threonine. Also included in the present
invention is a DNA molecule having a nucleotide sequence that is
identical to SEQ.ID.NO.:1 except that at least one of the nucleotides at
position 3,330 or 3,331 has been changed so that the codon at positions
3,330-3,332 does not encode threonine.

The present invention includes a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that the nucleotide at position 120 is C rather than
A. Also included in the present invention is a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that the nucleotide at position 120 is G, C, or T
rather than A, so that the codon at positions 120-122 does not encode
threonine. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that at least one of the nucleotides at position 120 or
121 has been changed so that the codon at positions 120-122 does not
encode threonine.

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that the nucleotide at position 120 is C rather than
A. Also included in the present invention is a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that the nucleotide at position 120 is G, C, or T
rather than A, so that the codon at positions 120-122 does not encode
threonine. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to positions 105-1,409 of
SEQ.ID.NO.:4 except that at least one of the nucleotides at position 120 or
121 has been changed so that the codon at positions 120-122 does not
encode threonine.

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 8,939 is A rather than T. Also included in the
present invention is a DNA molecule having a nucleotide sequence that
is identical to SEQ.ID.NO.:1 except that the nucleotide at position 8,939 of
SEQ.ID.NO.:1 is A, G, or C, rather than T, so that the codon at positions
8,939-8,941 does not encode tyrosine. Also included in the present
invention is a DNA molecule having a nucleotide sequence that is
identical to SEQ.ID.NO.:1 except that at least one of the nucleotides at
position 8,939-8,941 has been changed so that the codon at positions 8,939- 
8,941 does not encode tyrosine. 

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that the nucleotide at position 783 is A rather than 
T. Also included in the present invention is a DNA molecule having a
nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that the nucleotide at position 783 is A, G, or C 
rather than T so that the codon at positions 783-785 does not encode 
tyrosine. Also included in the present invention is a DNA molecule 
having a nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that at least one of the nucleotides at position 783-
785 has been changed so that the codon at positions 783-785 does not
encode tyrosine. 

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,409 of 
SEQ.ID.NO.:4 except that the nucleotide at position 783 is A rather than
T. Also included in the present invention is a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,409 of 
SEQ.ID.NO.:4 except that the nucleotide at position 783 is A, G, or C 
rather than T, so that the codon at positions 783-785 does not encode 
tyrosine. Also included in the present invention is a DNA molecule 
having a nucleotide sequence that is identical to positions 105-1,409 of 
SEQ.ID.NO.:4 except that at least one of the nucleotides at position 783- 
785 has been changed so that the codon at positions 783-785 does not 
encode tyrosine. 

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to SEQ.ID.NO.:1 except that the
nucleotide at position 11,241 is A rather than G. Also included in the
present invention is a DNA molecule having a nucleotide sequence that
is identical to SEQ.ID.NO.:1 except that the nucleotide at position 11,241
is A, C, or T, rather than G, so that the codon at positions 11,240-11,242
does not encode glycine. Also included in the present invention is a
DNA molecule having a nucleotide sequence that is identical to
SEQ.ID.NO.:1 except that at least one of the nucleotides at position 11,240
or 11,241 has been changed so that the codon at positions 11,240-11,242
does not encode glycine.

The present invention includes a DNA molecule having a 
nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that the nucleotide at position 1,000 is A rather
than G. Also included in the present invention is a DNA molecule
having a nucleotide sequence that is identical to positions 105-1,859 of
SEQ.ID.NO.:2 except that the nucleotide at position 1,000 is A, C, or T
rather than G, so that the codon at positions 999-1,001 does not encode 
glycine. Also included in the present invention is a DNA molecule 
having a nucleotide sequence that is identical to positions 105-1,859 of 
SEQ.ID.NO.:2 except that at least one of the nucleotides at position 999 or 
1,000 has been changed so that the codon at positions 999-1,001 does not 
encode glycine.

Another aspect of the present invention includes host
cells that have been engineered to contain and/or express DNA
sequences encoding CG1CE protein. Such recombinant host cells can be
cultured under suitable conditions to produce CG1CE protein. An
expression vector containing DNA encoding CG1CE protein can be used
for expression of CG1CE protein in a recombinant host cell.
Recombinant host cells may be prokaryotic or eukaryotic, including but 
not limited to, bacteria such as E. coli, fungal cells such as yeast, 
mammalian cells including, but not limited to, cell lines of human, 
bovine, porcine, monkey and rodent origin, and insect cells including 
but not limited to Drosophila and silkworm derived cell lines. Cell lines 
derived from mammalian species which are suitable for recombinant 
expression of CG1CE protein and which are commercially available,
include but are not limited to, L cells L-M(TK-) (ATCC CCL 1.3), L cells
L-M (ATCC CCL 1.2), 293 (ATCC CRL 1573), Raji (ATCC CCL 86), CV-1
(ATCC CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651),
CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL
1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL
26) and MRC-5 (ATCC CCL 171).

A variety of mammalian expression vectors can be used to 
express recombinant CG1CE in mammalian cells. Commercially
available mammalian expression vectors which are suitable include,
but are not limited to, pMC1neo (Stratagene), pSG5 (Stratagene),
pcDNAI and pcDNAIamp, pcDNA3, pcDNA3.1, pCR3.1 (Invitrogen),
EBO-pSV2-neo (ATCC 37593), pBPV-1(8-2) (ATCC 37110),
pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC 37199), pRSVneo
(ATCC 37198), and pSV2-dhfr (ATCC 37146). Following expression in
recombinant cells, CG1CE can be purified by conventional techniques to
a level that is substantially free from other proteins.

The present invention includes CG1CE protein
substantially free from other proteins. The amino acid sequence of the
full-length CG1CE protein is shown in Figure 3 as SEQ.ID.NO.:3. Thus,
the present invention includes CG1CE protein substantially free from
other proteins having the amino acid sequence SEQ.ID.NO.:3. Also
included in the present invention is a CG1CE protein that is produced
from an alternatively spliced CG1CE mRNA where the protein has the
amino acid sequence shown in Figure 5 as SEQ.ID.NO.:5.

Mutated forms of CG1CE proteins are intended to be within
the scope of the present invention. In particular, mutated forms of
SEQ.ID.NOs.:3 and 5 that give rise to Best's macular dystrophy are
within the scope of the present invention. Accordingly, the present
invention includes a protein having the amino acid sequence shown in
Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position 93 is
cysteine rather than tryptophan. The present invention also includes a
protein having the amino acid sequence shown in Figure 5 as
SEQ.ID.NO.:5 except that the amino acid at position 93 is cysteine rather
than tryptophan. The present invention includes a protein having the
amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the
amino acid at position 93 is not tryptophan. The present invention also
includes a protein having the amino acid sequence shown in Figure 5 as
SEQ.ID.NO.:5 except that the amino acid at position 93 is not tryptophan.
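
The amino acid positions used for the mutant proteins correspond to the codon positions given for the cDNA sequences. The sketch below shows the arithmetic, assuming that nucleotide 105 is the first base of the start codon (as the ranges 105-1,859 and 105-1,409 suggest); the function name residue_number is hypothetical.

```python
ORF_START = 105  # assumed first coding nucleotide in SEQ.ID.NO.:2 and SEQ.ID.NO.:4


def residue_number(codon_start: int, orf_start: int = ORF_START) -> int:
    """Convert the cDNA position of a codon's first base to a 1-based residue number."""
    offset = codon_start - orf_start
    if offset < 0 or offset % 3 != 0:
        raise ValueError("position is not the first base of an in-frame codon")
    return offset // 3 + 1


# Codon start positions named in the text for the cDNA sequences.
codon_starts = {"tryptophan": 381, "tyrosine": 357, "threonine": 120}
residues = {name: residue_number(pos) for name, pos in codon_starts.items()}
```

With these inputs the tryptophan codon at positions 381-383 maps to residue 93, in agreement with the protein-level description above.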

The present invention includes a protein having the amino
acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino
acid at position 85 is histidine rather than tyrosine. The present
invention also includes a protein having the amino acid sequence shown
in Figure 5 as SEQ.ID.NO.:5 except that the amino acid at position 85 is
histidine rather than tyrosine. The present invention includes a protein
having the amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3
except that the amino acid at position 85 is not tyrosine. The present
invention also includes a protein having the amino acid sequence shown
in Figure 5 as SEQ.ID.NO.:5 except that the amino acid at position 85 is
not tyrosine.

The present invention includes a protein having the amino
acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino
acid at position 6 is proline rather than threonine. The present
invention also includes a protein having the amino acid sequence shown
in Figure 5 as SEQ.ID.NO.:5 except that the amino acid at position 6 is
proline rather than threonine. The present invention includes a protein
having the amino acid sequence shown in Figure 3 as SEQ.ID.NO.:3
except that the amino acid at position 6 is not threonine. The present
invention also includes a protein having the amino acid sequence shown
in Figure 5 as SEQ.ID.NO.:5 except that the amino acid at position 6 is
not threonine.

The present invention includes a protein having the amino
acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino
acid at position 227 is asparagine rather than tyrosine. The present
invention also includes a protein having the amino acid sequence shown
in Figure 5 as SEQ.ID.NO.:5 except that the amino acid at position 227 is
asparagine rather than tyrosine. The present invention includes a
protein having the amino acid sequence shown in Figure 3 as
SEQ.ID.NO.:3 except that the amino acid at position 227 is not tyrosine.
The present invention also includes a protein having the amino acid
sequence shown in Figure 5 as SEQ.ID.NO.:5 except that the amino acid
at position 227 is not tyrosine.

The present invention includes a protein having the amino
acid sequence shown in Figure 3 as SEQ.ID.NO.:3 except that the amino
acid at position 299 is glutamate rather than glycine. The present
invention includes a protein having the amino acid sequence shown in
Figure 3 as SEQ.ID.NO.:3 except that the amino acid at position 299 is
not glycine.

As with many proteins, it is possible to modify many of the
amino acids of CG1CE and still retain substantially the same biological
activity as the original protein. Thus, the present invention includes
modified CG1CE proteins which have amino acid deletions, additions, or
substitutions but that still retain substantially the same biological
activity as CG1CE. It is generally accepted that single amino acid
substitutions do not usually alter the biological activity of a protein (see,
e.g., Molecular Biology of the Gene, Watson et al., 1987, Fourth Ed., The
Benjamin/Cummings Publishing Co., Inc., page 226; and Cunningham
& Wells, 1989, Science 244:1081-1085). Accordingly, the present invention
includes polypeptides where one amino acid substitution has been made
in SEQ.ID.NOs.:3 or 5 wherein the polypeptides still retain substantially
the same biological activity as CG1CE. The present invention also
includes polypeptides where two amino acid substitutions have been
made in SEQ.ID.NOs.:3 or 5 wherein the polypeptides still retain
substantially the same biological activity as CG1CE. In particular, the
present invention includes embodiments where the above-described
substitutions are conservative substitutions. In particular, the present
invention includes embodiments where the above-described
substitutions do not occur in positions where the amino acid present in
CG1CE is also present in one of the C. elegans proteins whose partial
sequence is shown in Figure 7.

The CG1CE proteins of the present invention may contain 
post-translational modifications, e.g., covalently linked carbohydrate. 

The present invention also includes chimeric CG1CE 
proteins. Chimeric CG1CE proteins consist of a contiguous polypeptide 
sequence of at least a portion of a CG1CE protein fused to a polypeptide 
sequence of a non-CG1CE protein.

The present invention also includes isolated forms of 
CG1CE proteins and CG1CE DNA. By "isolated CG1CE protein" or 
"isolated CG1CE DNA" is meant CG1CE protein or DNA encoding 
CG1CE protein that has been isolated from a natural source. Use of the 
term "isolated" indicates that CG1CE protein or CG1CE DNA has been 
removed from its normal cellular environment. Thus, an isolated 
CG1CE protein may be in a cell-free solution or placed in a different 
cellular environment from that in which it occurs naturally. The term 
isolated does not imply that an isolated CG1CE protein is the only protein 
present, but instead means that an isolated CG1CE protein is at least 
95% free of non-amino acid material (e.g., nucleic acids, lipids, 
carbohydrates) naturally associated with the CG1CE protein. Thus, a 
CG1CE protein that is expressed in bacteria or even in eukaryotic cells 
which do not naturally (i.e., without human intervention) express it 
through recombinant means is an "isolated CG1CE protein." 

A cDNA fragment encoding full-length CG1CE can be 
isolated from a human retinal cell cDNA library by using the 
polymerase chain reaction (PCR) employing suitable primer pairs. 
Such primer pairs can be selected based upon the cDNA sequence for 
CG1CE shown in Figure 2 as SEQ.ID.NO.:2 or in Figure 4 as
SEQ.ID.NO.:4. Suitable primer pairs would be, e.g.:

CAGGGAGTCCCACCAGCC (SEQ.ID.NO.:6) and
TCCCCATTAGGAAGCAGG (SEQ.ID.NO.:7)
for SEQ.ID.NO.:2; and

CAGGGAGTCCCACCAGCC (SEQ.ID.NO.:6) and
TCTCCTCTTTGTTCAGGC (SEQ.ID.NO.:8)
for SEQ.ID.NO.:4.
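
Basic properties of the three primers listed above can be checked with a few lines of code. This is a rough sketch, not part of the disclosed method: the melting-temperature estimate uses the Wallace rule of thumb (2°C per A or T, 4°C per G or C), which is only an approximation for short oligonucleotides, and the function names are hypothetical.

```python
# Primer sequences as given in the text.
PRIMERS = {
    "SEQ.ID.NO.:6": "CAGGGAGTCCCACCAGCC",
    "SEQ.ID.NO.:7": "TCCCCATTAGGAAGCAGG",
    "SEQ.ID.NO.:8": "TCTCCTCTTTGTTCAGGC",
}


def gc_count(sequence: str) -> int:
    """Number of G and C bases in the primer."""
    return sum(sequence.count(base) for base in "GC")


def wallace_tm(sequence: str) -> int:
    """Approximate melting temperature in degrees C (Wallace rule)."""
    gc = gc_count(sequence)
    at = len(sequence) - gc
    return 2 * at + 4 * gc


summary = {
    name: {"length": len(seq), "gc": gc_count(seq), "tm": wallace_tm(seq)}
    for name, seq in PRIMERS.items()
}
```

All three primers are 18 nucleotides long; the estimates are intended only as a sanity check when choosing an annealing temperature.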

PCR reactions can be carried out with a variety of 
thermostable enzymes including but not limited to AmpliTaq, AmpliTaq 
Gold, or Vent polymerase. For AmpliTaq, reactions can be carried out 
in 10 mM Tris-Cl, pH 8.3, 2.0 mM MgCl2, 200 µM for each dNTP, 50 mM
KCl, 0.2 µM for each primer, 10 ng of DNA template, 0.05 units/µl of
AmpliTaq. The reactions are heated at 95°C for 3 minutes and then 
cycled 35 times using the cycling parameters of 95°C, 20 seconds, 62°C, 
20 seconds, 72°C, 3 minutes. In addition to these conditions, a variety of 
suitable PCR protocols can be found in PCR Primer: A Laboratory
Manual, edited by C.W. Dieffenbach and G.S. Dveksler, 1995, Cold
Spring Harbor Laboratory Press; or PCR Protocols: A Guide to Methods
and Applications, Michael et al., eds., 1990, Academic Press.

A suitable cDNA library from which a clone encoding 
CG1CE can be isolated would be Human Retina 5'-stretch cDNA library 
in lambda gt10 or lambda gt11 vectors (catalog numbers HL1143a and
HL1132b, Clontech, Palo Alto, CA). The primary clones of such a library 
can be subdivided into pools with each pool containing approximately 
20,000 clones and each pool can be amplified separately. 

By this method, a cDNA fragment encoding an open 
reading frame of 585 amino acids (SEQ.ID.NO.:3) or an open reading 
frame of 435 amino acids (SEQ.ID.NO.:5) can be obtained. This cDNA 
fragment can be cloned into a suitable cloning vector or expression 
vector. For example, the fragment can be cloned into the mammalian 
expression vector pcDNA3.1 (Invitrogen, San Diego, CA). CG1CE
protein can then be produced by transferring an expression vector
encoding CG1CE or portions thereof into a suitable host cell and growing
the host cell under appropriate conditions. CG1CE protein can then be
isolated by methods well known in the art.

As an alternative to the above-described PCR method, a 
cDNA clone encoding CG1CE can be isolated from a cDNA library using
as a probe oligonucleotides specific for CG1CE and methods well known
in the art for screening cDNA libraries with oligonucleotide probes.
Such methods are described in, e.g., Sambrook et al., 1989, Molecular
Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory, Cold
Spring Harbor, New York; Glover, D.M. (ed.), 1985, DNA Cloning: A
Practical Approach, MRL Press, Ltd., Oxford, U.K., Vol. I, II.
Oligonucleotides that are specific for CG1CE and that can be used to
screen cDNA libraries can be readily designed based upon the cDNA
sequence of CG1CE shown in Figure 2 as SEQ.ID.NO.:2 or in Figure 4 as
SEQ.ID.NO.:4 and can be synthesized by methods well-known in the art.

Genomic clones containing the CG1CE gene can be obtained 
from commercially available human PAC or BAC libraries available 
from Research Genetics, Huntsville, AL. PAC clones containing the 
CG1CE gene (e.g., PAC 759J12, PAC 466A11) are commercially available
from Research Genetics, Huntsville, AL (Catalog number for individual
PAC clones is RPCI.C). Alternatively, one may prepare genomic
libraries, especially in P1 artificial chromosome vectors, from which
genomic clones containing the CG1CE gene can be isolated, using probes
based upon the CG1CE sequences disclosed herein. Methods of
preparing such libraries are known in the art (Ioannou et al., 1994,
Nature Genet. 6:84-89).

The novel DNA sequences of the present invention can be 
used in various diagnostic methods relating to Best's macular 
dystrophy. The present invention provides diagnostic methods for
determining whether a patient carries a mutation in the CG1CE gene
that predisposes that patient toward the development of Best's macular
dystrophy. In broad terms, such methods comprise determining the
DNA sequence of a region of the CG1CE gene from the patient and
comparing that sequence to the sequence from the corresponding region
of the CG1CE gene from a normal person, i.e., a person who does not
suffer from Best's macular dystrophy. 

Such methods of diagnosis may be carried out in a variety of 
ways. For example, one embodiment comprises: 

(a) providing PCR primers from a region of the CG1CE
gene where it is suspected that a patient harbors a mutation in the
CG1CE gene;

(b) performing PCR on a DNA sample from the patient
to produce a PCR fragment from the patient;

(c) performing PCR on a control DNA sample having a
nucleotide sequence selected from the group consisting of
SEQ.ID.NOs.:1, 2 and SEQ.ID.NO.:4 to produce a control PCR fragment;

(d) determining the nucleotide sequence of the PCR
fragment from the patient and the nucleotide sequence of the control 
PCR fragment; 

(e) comparing the nucleotide sequence of the PCR 
fragment from the patient to the nucleotide sequence of the control PCR 
fragment; 

where a difference between the nucleotide sequence of the 
PCR fragment from the patient and the nucleotide sequence of the 
control PCR fragment indicates that the patient has a mutation in the 
CG1CE gene. 
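
The comparison in steps (d) and (e) can be sketched in Python as follows. This is a minimal illustration that assumes the patient and control fragment sequences have already been determined and are aligned to the same length; the function name and the short example sequences are hypothetical and are not CG1CE sequence data.

```python
def find_differences(patient_seq, control_seq):
    """List (position, control_base, patient_base) for every mismatch.

    Positions are 1-based. Assumption: both sequences cover the same
    region and have equal length (no insertions or deletions).
    """
    patient_seq = patient_seq.upper()
    control_seq = control_seq.upper()
    if len(patient_seq) != len(control_seq):
        raise ValueError("sequences must have the same length")
    return [
        (pos, c, p)
        for pos, (p, c) in enumerate(zip(patient_seq, control_seq), start=1)
        if p != c
    ]


if __name__ == "__main__":
    control = "ATGACCGTTA"   # made-up control fragment
    patient = "ATGACCCTTA"   # made-up patient fragment, one substitution
    diffs = find_differences(patient, control)
    print("mutation indicated" if diffs else "no difference found", diffs)
```

Any non-empty result corresponds to the "difference" that, in the method above, indicates a mutation.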

In a particular embodiment, the PCR primers are from the 
coding region of the CG1CE gene, i.e., from the coding region of 
SEQ.ID.NOs.:1, 2, or 4.

In a particular embodiment, the DNA sample from the 
patient is cDNA that has been prepared from an RNA sample from the 
patient. In another embodiment, the DNA sample from the patient is 
genomic DNA. 

In a particular embodiment, the nucleotide sequences of the 
PCR fragment from the patient and the control PCR fragment are 
determined by DNA sequencing. 

In a particular embodiment, the nucleotide sequences of the 
PCR fragment from the patient and the control PCR fragment are 
compared by direct comparison after DNA sequencing. In another 
embodiment, the comparison is made by a process that includes 
hybridizing the PCR fragment from the patient and the control PCR 
fragment and then using an endonuclease that cleaves at any 
mismatched positions in the hybrid but does not cleave the hybrid if the 
two fragments match perfectly. Such an endonuclease is, e.g., S1. In
this embodiment, the conversion of the PCR fragment from the patient to
smaller fragments after endonuclease treatment indicates that the
patient carries a mutation in the CG1CE gene. In such embodiments, it
may be advantageous to label (radioactively, enzymatically,
immunologically, etc.) the PCR fragment from the patient or the control
PCR fragment. 

The present invention provides a method of diagnosing 
whether a patient carries a mutation in the CG1CE gene that comprises:

(a) obtaining an RNA sample from the patient;

(b) performing reverse transcription-PCR (RT-PCR) on
the RNA sample using primers that span a region of the coding
sequence of the CG1CE gene to produce a PCR fragment from the patient
where the PCR fragment from the patient has a defined length, the
length being dependent upon the identity of the primers that were used
in the RT-PCR;

(c) hybridizing the PCR fragment to DNA having a
sequence selected from the group consisting of SEQ.ID.NOs.:1, 2 and
SEQ.ID.NO.:4 to form a hybrid;

(d) treating the hybrid produced in step (c) with an 
endonuclease that cleaves at any mismatched positions in the hybrid but 
does not cleave the hybrid if the two fragments match perfectly; 

(e) determining whether the endonuclease cleaved the 
hybrid by determining the length of the PCR fragment from the patient 
after endonuclease treatment where a reduction in the length of the PCR 
fragment from the patient after endonuclease treatment indicates that 
the patient carries a mutation in the CG1CE gene. 
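
The read-out of step (e) can be illustrated with a short Python sketch. It assumes, as a simplification, that the endonuclease cuts exactly at each mismatched position; the 300-nucleotide fragment length and the mismatch position used in the example are hypothetical, as are the function names.

```python
def fragments_after_cleavage(fragment_length, mismatch_positions):
    """Lengths of the pieces left after cutting at each mismatch.

    Assumption: one cut immediately after each 1-based mismatch position
    that lies inside the fragment.
    """
    cuts = sorted({p for p in mismatch_positions if 0 < p < fragment_length})
    bounds = [0] + cuts + [fragment_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def indicates_mutation(fragment_length, observed_lengths):
    """True if any observed piece is shorter than the uncut fragment."""
    return any(length < fragment_length for length in observed_lengths)


if __name__ == "__main__":
    full_length = 300                                        # hypothetical RT-PCR product size
    pieces = fragments_after_cleavage(full_length, [120])    # hypothetical mismatch
    print(pieces, indicates_mutation(full_length, pieces))
```

A perfectly matched hybrid yields a single piece of the original length, so no mutation is indicated.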

The present invention provides a method of diagnosing 
whether a patient carries a mutation in the CG1CE gene that comprises:

(a) making cDNA from an RNA sample from the 
patient; 

(b) providing a set of PCR primers based upon 
SEQ.ID.NO.:2 or SEQ.ID.NO.:4; 

(c) performing PCR on the cDNA to produce a PCR
fragment from the patient;

(d) determining the nucleotide sequence of the PCR 
fragment from the patient; 

(e) comparing the nucleotide sequence of the PCR 
fragment from the patient with the nucleotide sequence of SEQ.ID.NO.:2
or SEQ.ID.NO.:4;

where a difference between the nucleotide sequence of the 
PCR fragment from the patient with the nucleotide sequence of 
SEQ.ID.NO.:2 or SEQ.ID.NO.:4 indicates that the patient carries a 
mutation in the CG1CE gene.

The present invention provides a method of diagnosing
whether a patient carries a mutation in the CG1CE gene that comprises:

(a) preparing genomic DNA from the patient; 

(b) providing a set of PCR primers based upon 
SEQ.ID.NO.:1, SEQ.ID.NO.:2, or SEQ.ID.NO.:4;

(c) performing PCR on the genomic DNA to produce a 
PCR fragment from the patient; 

(d) determining the nucleotide sequence of the PCR 
fragment from the patient; 

(e) comparing the nucleotide sequence of the PCR 
fragment from the patient with the nucleotide sequence of SEQ.ID.NO.:2 
or SEQ.ID.NO.:4; 

where a difference between the nucleotide sequence of the 
PCR fragment from the patient with the nucleotide sequence of 
SEQ.ID.NO.:2 or SEQ.ID.NO.:4 indicates that the patient carries a 
mutation in the CG1CE gene. 

In a particular embodiment, the primers are selected so 
that they amplify a portion of SEQ.ID.NOs.:2 or 4 that includes at least 
one position selected from the group consisting of: positions 120, 121, 122, 
357, 358, 359, 381, 382, 383, 783, 784, and 785. In another embodiment, the 
primers are selected so that they amplify a portion of SEQ.ID.NOs.:2 or 4
that includes at least one position selected from the group consisting of:
positions 384, 385, and 386. In another embodiment, the primers are
selected so that they amplify a portion of SEQ.ID.NO.:2 that includes at
least one position selected from the group consisting of: positions 999, 
1,000, and 1,001. In another embodiment, the primers are selected so 
that they amplify a portion of SEQ.ID.NOs.:2 or 4 that includes at least 
one codon that encodes an amino acid present in CG1CE that is also 
present in the corresponding position in at least one of the C. elegans 
proteins whose partial amino acid sequence is shown in Figure 7. 

In a particular embodiment, the present invention provides 
a diagnostic method for determining whether a person carries a 
mutation of the CG1CE gene in which the G at position 383 of 
SEQ.ID.NO.:2 has been changed to a C. This change results in the
creation of a Fnu4HI restriction site. By amplifying a PCR fragment
spanning position 383 of SEQ.ID.NO.:2 from DNA or cDNA prepared
from a person, digesting the PCR fragment with Fnu4HI, and
visualizing the digestion products, e.g., by SDS-PAGE, one can easily 
determine if the person carries the G383C mutation. For example, one 
could use the PCR primer pair 5'-CTCCTGCCCAGGCTTCTAC-3' 
(SEQ.ID.NO.:30) and 5'-CTTGCTCTGCCTTGCCTTC-3' (SEQ.ID.NO.:31) 
to amplify a 125 base pair fragment. Heterozygotes for the G383C 
mutation have three Fnu4HI digestion products: 125 bp, 85 bp, and 40 bp; 
homozygotes have two: 85 bp and 40 bp; and wild-type individuals have a 
single fragment of 125 bp. 
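
The interpretation of the Fnu4HI digestion pattern can be written out as a small Python sketch. The fragment sizes (125, 85, and 40 bp) and the three genotype classes come from the text; the function name is hypothetical, and the sketch assumes that band sizes have already been read from the gel as exact integers.

```python
UNCUT_BP = 125        # undigested PCR product (wild-type allele)
CUT_BP = (85, 40)     # digestion products of the G383C allele


def call_g383c_genotype(fragment_sizes_bp):
    """Map the set of observed Fnu4HI fragment sizes to a genotype."""
    observed = set(fragment_sizes_bp)
    if observed == {UNCUT_BP}:
        return "wild-type"
    if observed == set(CUT_BP):
        return "homozygous G383C"
    if observed == {UNCUT_BP, *CUT_BP}:
        return "heterozygous G383C"
    raise ValueError(f"unexpected fragment pattern: {sorted(observed)}")


if __name__ == "__main__":
    for bands in ([125], [125, 85, 40], [85, 40]):
        print(bands, "->", call_g383c_genotype(bands))
```

In practice, band sizes estimated from a gel would need a size tolerance rather than exact matching.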

In a particular embodiment, the present invention provides 
a diagnostic method for determining whether a person carries a 
mutation of the CG1CE gene in which the T at position 783 of 
SEQ.ID.NO.:2 has been changed to an A. This change results in the 
creation of a PflMI restriction site. By amplifying a PCR fragment 
spanning position 783 of SEQ.ID.NO.:2 from DNA or cDNA prepared 
from a person, digesting the PCR fragment with PflMI, and visualizing 
the digestion products, e.g., by SDS-PAGE, one can easily determine if 
the person carries the T783A mutation. 

The present invention also provides oligonucleotide probes, 
based upon the sequences of SEQ.ID.NOs.:1, 2, or 4, that can be used in
diagnostic methods related to Best's macular dystrophy. In particular,
the present invention includes DNA oligonucleotides comprising at least
18 contiguous nucleotides of at least one of a sequence selected from the
group consisting of: SEQ.ID.NOs.:1, 2 and SEQ.ID.NO.:4. Also provided
by the present invention are corresponding RNA oligonucleotides. The 
DNA or RNA oligonucleotide probes can be packaged in kits. 
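
The "at least 18 contiguous nucleotides" criterion can be checked mechanically, as in the following Python sketch. The reference string used in the example is a made-up placeholder rather than any of the disclosed sequences, the function name is hypothetical, and only the strand as written is examined (the complementary strand is not considered).

```python
def has_contiguous_run(oligo, reference, min_len=18):
    """True if oligo contains at least min_len contiguous nucleotides
    that also occur contiguously in reference (same strand only)."""
    oligo = oligo.upper()
    reference = reference.upper()
    return any(
        oligo[i:i + min_len] in reference
        for i in range(len(oligo) - min_len + 1)
    )


if __name__ == "__main__":
    # Placeholder reference; NOT a CG1CE sequence.
    reference = "ATGACCGTTAGGCTAACGGATTCCAGTACGTTAGCCATGA"
    probe = reference[5:25]                   # 20-mer taken from the reference
    print(has_contiguous_run(probe, reference))
```

An oligonucleotide shorter than 18 nucleotides can never satisfy the criterion, which the function reflects by returning False.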

In addition to the diagnostic utilities described above, the 
present invention makes possible the recombinant expression of the 
CG1CE protein in various cell types. Such recombinant expression 
makes possible the study of this protein so that its biochemical activity 
and its role in Best's macular dystrophy can be elucidated. 

The present invention also makes possible the development 
of assays which measure the biological activity of the CG1CE protein. 
Such assays using recombinantly expressed CG1CE protein are
especially of interest. Assays for CG1CE protein activity can be used to
screen libraries of compounds or other sources of compounds to identify
compounds that are activators or inhibitors of the activity of CG1CE
protein. Such identified compounds can serve as "leads" for the 
development of pharmaceuticals that can be used to treat patients 
having Best's macular dystrophy. In versions of the above-described 
assays, mutant CG1CE proteins are used and inhibitors or activators of 
the activity of the mutant CG1CE proteins are discovered. 
Such assays comprise: 

(a) recombinantly expressing CG1CE protein or mutant 
CG1CE protein in a host cell; 

(b) measuring the biological activity of CG1CE protein or 
mutant CG1CE protein in the presence and in the absence of a substance 
suspected of being an activator or an inhibitor of CG1CE protein or 
mutant CG1CE protein; 

where a change in the biological activity of the CG1CE 
protein or the mutant CG1CE protein in the presence as compared to the 
absence of the substance indicates that the substance is an activator or 
an inhibitor of CG1CE protein or mutant CG1CE protein. 

The present invention also includes antibodies to the 
CG1CE protein. Such antibodies may be polyclonal antibodies or 
monoclonal antibodies. The antibodies of the present invention are 
raised against the entire CG1CE protein or against suitable antigenic 
fragments of the protein that are coupled to suitable carriers, e.g., 
serum albumin or keyhole limpet hemocyanin, by methods well known 
in the art. Methods of identifying suitable antigenic fragments of a 
protein are known in the art. See, e.g., Hopp & Woods, 1981, Proc. Natl. 
Acad. Sci. USA 78:3824-3828; and Jameson & Wolf, 1988, CABIOS 
(Computer Applications in the Biosciences) 4:181-186. 

For the production of polyclonal antibodies, CG1CE protein 
or an antigenic fragment, coupled to a suitable carrier, is injected on a 
periodic basis into an appropriate non-human host animal such as, e.g., 
rabbits, sheep, goats, rats, mice. The animals are bled periodically and 
sera obtained are tested for the presence of antibodies to the injected 
antigen. The injections can be intramuscular, intraperitoneal, 
subcutaneous, and the like, and can be accompanied with adjuvant. 

For the production of monoclonal antibodies, CG1CE 
protein or an antigenic fragment, coupled to a suitable carrier, is 
injected into an appropriate non-human host animal as above for the
production of polyclonal antibodies. In the case of monoclonal 
antibodies, the animal is generally a mouse. The animal's spleen cells 
are then immortalized, often by fusion with a myeloma cell, as described 
in Kohler & Milstein, 1975, Nature 256:495-497. For a fuller description 
of the production of monoclonal antibodies, see Antibodies: A Laboratory 
Manual, Harlow & Lane, eds., Cold Spring Harbor Laboratory Press, 
1988. 

Gene therapy may be used to introduce CG1CE polypeptides
into the cells of target organs, e.g., the pigmented epithelium of the
retina or other parts of the retina. Nucleotides encoding CG1CE
polypeptides can be ligated into viral vectors which mediate transfer of
the nucleotides by infection of recipient cells. Suitable viral vectors
include retrovirus, adenovirus, adeno-associated virus, herpes virus,
vaccinia virus, and polio virus based vectors. Alternatively, nucleotides
encoding CG1CE polypeptides can be transferred into cells for gene
therapy by non-viral techniques including receptor-mediated targeted
transfer using ligand-nucleotide conjugates, lipofection, membrane
fusion, or direct microinjection. These procedures and variations
thereof are suitable for ex vivo as well as in vivo gene therapy. Gene
therapy with CG1CE polypeptides will be particularly useful for the
treatment of diseases where it is beneficial to elevate CG1CE activity.

The present invention includes DNA comprising 
nucleotides encoding mouse CG1CE. Included within such DNA is the
DNA sequence shown in Figure 8A-C (SEQ. ID. NO.:28). Also included
is DNA comprising positions 11-1,663 of SEQ. ID. NO.:28. Also included
are mutant versions of DNA encoding mouse CG1CE. Included is DNA
comprising nucleotides that are identical to positions 11-1,663 of SEQ.
ID. NO.:28 except that at least one of the nucleotides at positions 26-28,
positions 263-265, positions 287-289, positions 689-691, and/or positions
905-907 differs from the corresponding nucleotide at positions 26-28,
positions 263-265, positions 287-289, positions 689-691, and/or positions
905-907 of SEQ. ID. NO.:28. Particularly preferred versions of mutant
DNAs are those in which the nucleotide change results in a change in
the corresponding encoded amino acid. The DNA encoding mouse
CG1CE can be in isolated form, can be substantially free from other
nucleic acids, and/or can be recombinant DNA. 

The present invention includes mouse CG1CE protein (SEQ. 
ID. NO.:29). This mouse CG1CE protein can be in isolated form and/or 
can be substantially free from other proteins. Mutant versions of mouse
CG1CE protein are also part of the present invention. Examples of such
mutant mouse CG1CE proteins are proteins that are identical to SEQ.
ID. NO.:29 except that the amino acid at position 6, position 85, position
93, position 227, and/or position 299 differs from the corresponding
amino acid at position 6, position 85, position 93, position 227, and/or
position 299 in SEQ. ID. NO.:29. 
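
The amino acid positions listed above line up with the nucleotide positions recited for SEQ. ID. NO.:28 if the first codon of the coding region begins at position 11. The following Python sketch makes that arithmetic explicit. The start position of 11 is an assumption inferred from the recited range "positions 11-1,663", and the function name is hypothetical.

```python
ORF_START = 11  # assumed first nucleotide of codon 1 in SEQ. ID. NO.:28


def codon_positions(residue_number, orf_start=ORF_START):
    """Return the (first, last) nucleotide positions of a codon.

    Positions and residue numbers are 1-based.
    """
    first = orf_start + 3 * (residue_number - 1)
    return first, first + 2


if __name__ == "__main__":
    for residue in (6, 85, 93, 227, 299):
        first, last = codon_positions(residue)
        print(f"amino acid {residue}: nucleotides {first}-{last}")
```

Under this assumption the five residues map to nucleotides 26-28, 263-265, 287-289, 689-691, and 905-907, matching the ranges recited for the mutant DNAs.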

cDNA encoding mouse CG1CE can be amplified by PCR
from cDNA libraries made from mouse eye or mouse testis. Suitable
primers can be readily designed based upon SEQ. ID. NO.:28.
Alternatively, cDNA encoding mouse CG1CE can be isolated from cDNA
libraries made from mouse eye or mouse testis by the use of
oligonucleotide probes based upon SEQ. ID. NO.:28.

In situ hybridization studies demonstrated that mouse
CG1CE is specifically expressed in the retinal pigmented epithelium
(see Figure 10).

By providing DNA encoding mouse CG1CE, the present
invention allows for the generation of an animal model of Best's
macular dystrophy. This animal model can be generated by making
"knockout" or "knockin" mice containing altered CG1CE genes.
Knockout mice can be generated in which portions of the mouse CG1CE
gene have been deleted. Knockin mice can be generated in which
mutations that have been shown to lead to Best's macular dystrophy
when present in the human CG1CE gene are introduced into the mouse
gene. In particular, mutations resulting in changes in amino acids 6,
85, 93, 227, or 299 of the mouse CG1CE protein (SEQ. ID. NO.:29) are
contemplated. Such knockout and knockin mice will be valuable tools in
the study of the Best's macular dystrophy disease process and will
provide important model systems in which to test potential
pharmaceuticals or treatments for Best's macular dystrophy.

Methods of producing knockout and knockin mice are
well known in the art. For example, the use of gene-targeted ES cells
in the generation of gene-targeted transgenic knockout mice is
described in, e.g., Thomas et al., 1987, Cell 51:503-512, and is
reviewed elsewhere (Frohman et al., 1989, Cell 56:145-147; Capecchi,
1989, Trends in Genet. 5:70-76; Baribault et al., 1989, Mol. Biol. Med.
6:481-492).

Techniques are available to inactivate or alter any 
genetic region to virtually any mutation desired by using targeted 
homologous recombination to insert specific changes into 
chromosomal genes. Generally, use is made of a "targeting vector,"
i.e., a plasmid containing part of the genetic region it is desired to
mutate. By virtue of the homology between this part of the genetic
region on the plasmid and the corresponding genetic region on the
chromosome, homologous recombination can be used to insert the
plasmid into the genetic region, thus disrupting the genetic region.
Usually, the targeting vector contains a selectable marker gene as
well. 

In comparison with homologous extrachromosomal 
recombination, which occurs at frequencies approaching 100%, 
homologous plasmid-chromosome recombination was originally
reported to only be detected at frequencies between 10^-6 and 10^-3 (Lin
et al., 1985, Proc. Natl. Acad. Sci. USA 82:1391-1395; Smithies et al.,
1985, Nature 317:230-234; Thomas et al., 1986, Cell 44:419-428).
Nonhomologous plasmid-chromosome interactions are more
frequent, occurring at levels 10^5-fold (Lin et al., 1985, Proc. Natl.
Acad. Sci. USA 82:1391-1395) to 10^2-fold (Thomas et al., 1986, Cell
44:419-428) greater than comparable homologous insertion.
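
To give a sense of scale for these figures, the following Python sketch converts the reported fold-excess of nonhomologous over homologous insertion into the expected fraction of integrants that are correctly targeted. It treats the fold-excess as a simple ratio of nonhomologous to homologous events, which is an interpretive assumption; the function name is hypothetical.

```python
def homologous_fraction(fold_excess_nonhomologous):
    """Fraction of all integration events that are homologous, given
    that nonhomologous events are fold_excess_nonhomologous times as
    frequent as homologous ones (assumed simple ratio)."""
    return 1.0 / (1.0 + fold_excess_nonhomologous)


if __name__ == "__main__":
    for fold in (1e2, 1e5):   # range quoted in the text
        frac = homologous_fraction(fold)
        print(f"{fold:.0e}-fold excess: about 1 targeted clone "
              f"per {1 / frac:.0f} integrants")
```

Under this reading, unselected screening would require examining on the order of a hundred to a hundred thousand integrants per targeted clone, which motivates the detection and selection strategies that follow.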

To overcome this low proportion of targeted 
recombination in murine ES cells, various strategies have been 
developed to detect or select rare homologous recombinants. One
approach for detecting homologous alteration events uses the
polymerase chain reaction (PCR) to screen pools of transformant
cells for homologous insertion, followed by screening individual
clones (Kim et al., 1988, Nucleic Acids Res. 16:8887-8903; Kim et al.,
1991, Gene 103:227-233). Alternatively, a positive genetic selection
approach has been developed in which a marker gene is constructed
which will only be active if homologous insertion occurs, allowing
these recombinants to be selected directly (Sedivy et al., 1989, Proc.
Natl. Acad. Sci. USA 86:227-231). One of the most powerful
approaches developed for selecting homologous recombinants is the
positive-negative selection (PNS) method developed for genes for
which no direct selection of the alteration exists (Mansour et al.,
1988, Nature 336:348-352; Capecchi, 1989, Science 244:1288-1292;
Capecchi, 1989, Trends in Genet. 5:70-76). The PNS method is more
efficient for targeting genes which are not expressed at high levels
because the marker gene has its own promoter. Nonhomologous
recombinants are selected against by using the Herpes Simplex virus
thymidine kinase (HSV-TK) gene and selecting against its
nonhomologous insertion with herpes drugs such as gancyclovir
(GANC) or FIAU (1-(2-deoxy-2-fluoro-β-D-arabinofuranosyl)-5-iodouracil).
By this counter-selection, the percentage of homologous
recombinants in the surviving transformants can be increased.

The following non-limiting examples are presented to better 
illustrate the invention. 

EXAMPLE 1

Identification of the human CG1CE gene and cDNA cloning

Construction of Libraries for Shotgun Sequencing 

Bacterial strains containing the BMD PACs (P1 Artificial
Chromosomes) were received from Research Genetics (Huntsville, AL).
The minimum tiling path between markers D11S4076 and UGB that
represents the minimum genetic region containing the BMD gene
includes the following nine PAC clones: 363M5 (140 kb), 519O13 (120 kb),
527E4 (150 kb), 688P12 (140 kb), 741N15 (170 kb), 756B9 (120 kb), 759J12 (140
kb), 1079D9 (170 kb), and 363P2 (160 kb). Cells were streaked on
Luria-Bertani (LB) agar plates supplemented with the appropriate antibiotic.
A single colony was picked and subjected to colony-PCR analysis with
corresponding STS primers described in Cooper et al., 1997, Genomics
41:185-192 to confirm the authenticity of PAC clones. A single positive
colony was used to prepare a 5-ml starter culture and then a 1-L overnight
culture in LB medium. The cells were pelleted by centrifugation and
PAC DNA was purified by equilibrium centrifugation in a cesium
chloride-ethidium bromide gradient (Sambrook, Fritsch, and Maniatis,
1989, Molecular Cloning: A Laboratory Manual, second edition, Cold
Spring Harbor Laboratory Press). Purified PAC DNA was brought to 50
mM Tris pH 8.0, 15 mM MgCl2, and 25% glycerol in a volume of 2 ml
and placed in an AERO-MIST nebulizer (CIS-US, Bedford, MA). The
nebulizer was attached to a nitrogen gas source and the DNA was
randomly sheared at 10 psi for 30 sec. The sheared DNA was ethanol
precipitated and resuspended in TE (10 mM Tris, 1 mM EDTA). The
ends were made blunt by treatment with Mung Bean Nuclease
(Promega, Madison, WI) at 30°C for 30 min, followed by
phenol/chloroform extraction, and treatment with T4 DNA polymerase
(GIBCO/BRL, Gaithersburg, MD) in multicore buffer (Promega,
Madison, WI) in the presence of 40 µM dNTPs at 16°C. To facilitate
subcloning of the DNA fragments, BstX I adapters (Invitrogen,
Carlsbad, CA) were ligated to the fragments at 14°C overnight with T4
DNA ligase (Promega, Madison, WI). Adapters and DNA fragments
less than 500 bp were removed by column chromatography using a
cDNA sizing column (GIBCO/BRL, Gaithersburg, MD) according to the
instructions provided by the manufacturer. Fractions containing DNA
greater than 1 kb were pooled and concentrated by ethanol precipitation.
The DNA fragments containing BstX I adapters were ligated into the
BstX I sites of pSHOT II which was constructed by subcloning the BstX I
sites from pcDNA II (Invitrogen, Carlsbad, CA) into the BssH II sites of
pBlueScript (Stratagene, La Jolla, CA). pSHOT II was prepared by
digestion with BstX I restriction endonuclease and purified by agarose
gel electrophoresis. The gel purified vector DNA was extracted from the
agarose by following the Prep-A-Gene (BioRad, Richmond, CA) protocol.
To reduce ligation of the vector to itself, the digested vector was treated
with calf intestinal phosphatase (GIBCO/BRL, Gaithersburg, MD).
Ligation reactions of the DNA fragments with the cloning vector were
transformed into ultra-competent XL-2 Blue cells (Stratagene, La Jolla,
CA), and plated on LB agar plates supplemented with 100 µg/ml
ampicillin. Individual colonies were picked into a 96 well plate
containing 100 µl/well of LB broth supplemented with ampicillin and
grown overnight at 37°C. Approximately 25 µl of 80% sterile glycerol
was added to each well and the cultures stored at -80°C.

Preparation of plasmid DNA

Glycerol stocks were used to inoculate 5 ml of LB broth
supplemented with 100 µg/ml ampicillin either manually or by using a
Tecan Genesis RSP 150 robot (Tecan AG, Hombrechtikon, Switzerland)
programmed to inoculate 96 tubes containing 5 ml broth from the 96
wells. The cultures were grown overnight at 37°C with shaking to
provide aeration. Bacterial cells were pelleted by centrifugation, the
supernatant decanted, and the cell pellet stored at -20°C. Plasmid DNA
was prepared with a QIAGEN Bio Robot 9600 (QIAGEN, Chatsworth,
CA) according to the Qiawell Ultra protocol. To test the frequency and
size of inserts, plasmid DNA was digested with the restriction
endonuclease Pvu II. The size of the restriction endonuclease products
was examined by agarose gel electrophoresis with the average insert
size being 1 to 2 kb.

DNA Sequence Analysis of Shotgun clones

DNA sequence analysis was performed using the ABI
PRISM™ dye terminator cycle sequencing ready reaction kit with
AmpliTaq DNA polymerase, FS (Perkin Elmer, Norwalk, CT). DNA
sequence analysis was performed with M13 forward and reverse
primers. Following amplification in a Perkin-Elmer 9600, the extension
products were purified and analyzed on an ABI PRISM 377 automated
sequencer (Perkin Elmer, Norwalk, CT). Approximately 4 sequencing
reactions were performed per kb of DNA to be examined (384 sequencing
reactions per each of nine PACs).

Assembly of DNA sequences

Phred/Phrap was used for DNA sequence assembly. This
program was developed by Dr. Phil Green and licensed from the
University of Washington (Seattle, WA). Phred/Phrap consists of the
following programs: Phred for base-calling, Phrap for sequence
assembly, Crossmatch for sequence comparisons, Consed and
Phrapview for visualization of data, Repeatmasker for screening
repetitive sequences. Vector and E. coli DNA sequences were identified
by Crossmatch and removed from the DNA sequence assembly process.
DNA sequence assembly was on a SUN Enterprise 4000 server running
a Solaris 2.51 operating system (Sun Microsystems Inc., Mountain
View, CA) using default Phrap parameters. The sequence assemblies
were further analyzed using Consed and Phrapview.

Identification of new microsatellite genetic markers from the Best's
macular dystrophy region

Isolation of CA microsatellites from PAC-specific
sublibraries. Southern blotting and hybridization of PAC DNA with a
(dC-dA)n (dG-dT)n probe (Pharmacia Biotech, Uppsala, Sweden) was
used to confirm the presence of CA repeats in nine PAC clones that
represent a minimum tiling path. Shotgun PAC-specific sublibraries
were constructed from DNA of all 9 PAC clones using the protocol
described above. The sublibraries were plated on agar plates, and
colonies were transferred to nylon membranes and probed with randomly
primed polynucleotide (dC-dA)n (dG-dT)n. Hybridization was
performed overnight in a solution containing 6X SSC, 20 mM sodium
phosphate buffer (pH 7.0), 1% bovine serum albumin, and 0.2% sodium
dodecyl sulfate at 65°C. Filters were washed four times for 15 min each
in 2X SSC and 0.2% SDS at 65°C. CA-positive subclones were identified
for all but one PAC clone (527E4). DNA from these subclones was
isolated and sequenced as described above for the shotgun library
clones.

Identification of simple repeat sequences in assembled 
DNA sequences. DNA sequence at the final stage of assembly was 
checked for the presence of microsatellite repeats using a Consed 
visualization tool of the Phred/Phrap package.
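
As an illustration of what such a check looks for, the following Python sketch scans a sequence for runs of CA dinucleotide repeats (or TG repeats, which is how a CA repeat reads on the opposite strand). It is a stand-in written for illustration and does not reproduce the Consed tool; the function name and the minimum of 10 repeat units are assumptions.

```python
import re


def find_ca_repeats(seq, min_units=10):
    """Return (start, end, units) for each (CA)n or (TG)n run.

    start and end are 1-based inclusive positions; min_units is an
    assumed threshold for reporting a run.
    """
    pattern = re.compile(
        r"(?:CA){%d,}|(?:TG){%d,}" % (min_units, min_units)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        units = (match.end() - match.start()) // 2
        hits.append((match.start() + 1, match.end(), units))
    return hits


if __name__ == "__main__":
    toy = "GGG" + "CA" * 12 + "TTT"   # made-up sequence with a (CA)12 run
    print(find_ca_repeats(toy))
```

Candidate repeats found this way would still need flanking primers and polymorphism testing, as described in the next section.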

Polymorphism analysis and recombination mapping

Sequence fragments containing CA repeats were analyzed 
using the PRIMER program; oligonucleotide pairs flanking each of the 
CA repeats were synthesized. The forward primer was kinase-labeled 
with [gamma-32P]-ATP. Amplification of the genomic DNA was 
performed in a total volume of 10 µl containing 5 ng/µl of genomic DNA; 
10 mM Tris-HCl pH 8.3; 1.5 mM MgCl2; 50 mM KCl; 0.01% gelatin; 200 
µM dNTPs; 0.2 pmol/µl of both primers; 0.025 unit/µl of Taq polymerase. 
The PCR program consisted of 94°C for 3 min followed by 30 cycles of 
94°C for 1 min, 55°C for 2 min, 72°C for 2 min and a final elongation step 
at 72°C for 10 min. Following amplification, samples were mixed with 2 
vol of a formamide dye solution and run on a 6% polyacrylamide 
sequencing gel. Two newly identified markers detected two 
recombination events in disease chromosomes of individuals from 
family S1. This limited the minimum genetic region to the interval 
covered by 6 PAC clones: 519O13, 759J12, 756B9, 363M5, 363P2, and 
741N15. 
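
The per-microliter figures quoted for the marker PCR can be scaled to the whole reaction with simple arithmetic. The following is a minimal sketch only: the dictionary keys and function names are hypothetical, and it assumes the quoted figures are final concentrations in a 10 microliter reaction and that ramp times between temperature steps can be ignored.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text above.
REACTION_VOLUME_UL = 10.0

# Amounts quoted per microliter of reaction (assumed to be final concentrations).
PER_UL = {
    "genomic_dna_ng": 5.0,
    "each_primer_pmol": 0.2,
    "taq_units": 0.025,
}


def reaction_totals(per_ul, volume_ul):
    """Scale per-microliter amounts to the full reaction volume."""
    return {name: amount * volume_ul for name, amount in per_ul.items()}


def program_minutes(initial=3, cycles=30, per_cycle=(1, 2, 2), final=10):
    """Sum the programmed hold times in minutes (ramp times are ignored)."""
    return initial + cycles * sum(per_cycle) + final


totals = reaction_totals(PER_UL, REACTION_VOLUME_UL)
for name, amount in totals.items():
    print(f"{name}: {amount:g}")
print("programmed hold time (min):", program_minutes())
```

Under these assumptions each reaction holds 50 ng of genomic DNA, 2 pmol of each primer and 0.25 unit of Taq polymerase, and the program sums to 163 min of hold time.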

Identification of the retina-specific EST hit in the pCA759J12-2 clone. 

A CA-positive subclone (pCA759J12-2) was identified in the 
shotgun library generated from the PAC 759J12 DNA by hybridization to 
the (dC-dA)n (dG-dT)n probe. DNA sequence from pCA759J12-2 was 
queried against the EST sequences in the GenBank database using the 
BLAST algorithm (S.F. Altschul, et al., 1990, J. Mol. Biol. 215:403-410). 
The BLAST analysis identified a high degree of similarity between the 
DNA sequence obtained from the clone pCA759J12-2 and a 
retina-specific human EST with GenBank accession number AA318352. 
BLASTX analysis of EST AA318352 revealed a strong homology of the 
corresponding protein to a group of C. elegans proteins with unknown 
function (RFP family). The RFP family is known only from C. elegans 
genome and EST sequences (e.g., C. elegans C29F4.2 and B0564.3) and is 
named for the amino acid sequence RFP that is invariant among 15 of 
the 16 family members; members share a conserved 300-400 amino acid 
sequence including 25 highly conserved aromatic residues. 

A human gene partially represented in pCA759J12-2 and 
EST AA318352 was dubbed CG1CE (Candidate Gene #1 with the 
homology to the C. elegans group of genes) and selected for detailed 
analysis. 
Bioinformatic Analysis of Assembled DNA Sequences 

When the assembled DNA sequences from the nine BMD 
PACs reached 0.5-1-fold coverage, the DNA contigs were randomly 
concatenated, and prediction abilities of the program package AceDB 
were utilized to aid in gene identification. 

In addition to the DNA sequence generated from the nine 
PACs mentioned above, GenBank database entries for PACs 466A11 and 
363P2 (GenBank accession numbers AC003025 and AC003023, 
respectively) were analyzed with the use of the same AceDB package. 
PAC clones 466A11 and 363P2 represent parts of the PAC contig across 
the BMD region (Cooper et al., 1997, Genomics 41:185-192); both clones 
map to the minimum genetic region containing the BMD gene that was 
determined by recombination breakpoint analysis in a 12-generation 
Swedish pedigree (Graff et al., 1997, Hum. Genet. 101:263-279). Database 
entries for PACs 466A11 and 363P2 represent unordered DNA pieces 
generated in Phase 1 High Throughput Genome Sequence Project 
(HTGS phase 1) by Genome Science and Technology Center, University 
of Texas Southwestern Medical Center at Dallas. 

cDNA sequence and exon/intron organization of the CG1CE gene 

Genomic DNA sequences from PACs 466A11 and 759J12 
were compared with the CG1CE cDNA sequence from EST AA318352 
using the program Crossmatch, which allowed for a rapid and sensitive 
detection of the location of exons. The identification of intron/exon 
boundaries was then accomplished by manually comparing visualized 
genomic and cDNA sequences by using the AceDB package. This 
analysis allowed the identification of exons 8, 9, and 10 that are 
represented in EST AA318352. To increase the accuracy of the analysis, 
the DNA sequence of EST AA318352 was verified by comparison with 
genomic sequence obtained from pCA759J12-2, PAC 466A11, and 
shotgun PAC 759J12 subclones. The verified EST AA318352 sequence 
was reanalyzed by BLAST; two new ESTs (accession numbers AA307119 
and AA205892) were found to partially overlap with EST AA318352. They 
were assembled into a contig using the program Sequencher (Perkin 
Elmer, Norwalk, CT), and a consensus sequence derived from three 
ESTs (AA318352, AA307119, and AA205892) was re-analyzed by BLAST. 
BLAST analysis identified a fourth EST belonging to this cluster 
(accession number AA317489); EST AA317489 was included in the 
consensus cDNA sequence. The consensus sequence derived from the 
four ESTs (AA318352, AA307119, AA205892, and AA317489) was 
compared with genomic sequences obtained from pCA759J12-2, PAC 
466A11, and shotgun PAC 759J12 subclones using the programs 
Crossmatch and AceDB. This analysis verified the sequence and 
corrected sequencing errors that were found in AA318352, AA307119, 
AA205892, and AA317489. Comparison of cDNA and genomic sequences 
revealed a total of 7 exons. The order of the exons from 5' end to 3' end 
was 5'-ex4-ex5-ex6-ex8-ex9-ex10-ex11-3'. BLASTX analysis of the 
genomic segment located between exons 6 and 8 in PAC 466A11 revealed 
strong homology of the corresponding protein to a group of C. elegans 
proteins (RFP family). Since there were no EST hits in the GenBank 
EST database that cover this stretch of genomic sequence, this part of 
the CG1CE gene was called exH (Hypothetical ex 7). This finding 
changed the order of exons in the CG1CE gene to 
5'-ex4-ex5-ex6-ex7-ex8-ex9-ex10-ex11-3'. The BLAST analysis of the DNA region located 
upstream of exon 4 identified an additional human EST (AA326727) 
with a high degree of similarity to genomic sequence. Comparison of 
DNA and genomic sequences revealed the presence of two additional 
exons (ex1 and ex2) in the CG1CE gene. This finding changed the order 
of the exons in the CG1CE gene to 
5'-ex1-ex2-ex4-ex5-ex6-ex7-ex8-ex9-ex10-ex11-3'. Bioinformatic analysis did not allow the prediction of 
boundaries between exons 2 and 4, exons 6 and 7, and exons 7 and 8. In 
addition, there was no overlap between ESTs represented in exons 1 and 
2 from one side and exons 4, 5, 6, 7, 8, 9, 10, and 11 from another. There 
was the possibility of the presence of additional exons in the CG1CE gene 
that were not represented in the GenBank EST database. 

Identification of an additional exon and determination of the exact 
exon/intron boundaries within the CG1CE gene 

To identify additional exon(s) within the CG1CE gene and 
verify the exonic composition of this gene, forward and reverse PCR 
primers from all known exons of the CG1CE gene were synthesized and 
used to PCR amplify CG1CE cDNA fragments from human retina 
"Marathon-ready" cDNA (Clontech, Palo Alto, CA). In these RT-PCR 
experiments forward primer from ex1 (LF: 
CTAGTCGCCAGACCTTCTGTG) (SEQ.ID.NO.:9) was paired with a 
reverse primer from ex4 (GR: CTTGTAGACTGCGGTGCTGA) 
(SEQ.ID.NO.:10), forward primer from ex4 (GF: 
GAAAGCAAGGACGAGCAAAG) (SEQ.ID.NO.:11) was paired with a 
reverse primer from ex6 (BR: AATCCAGTCGTAGGCATACAGG) 
(SEQ.ID.NO.:12), forward primer from ex6 (EF: 
ACCTTGCGTACTCAGTGTGGA) (SEQ.ID.NO.:13) was paired with a 
reverse primer from ex8 (AR: TGTCGACAATCCAGTTGGTCT) 
(SEQ.ID.NO.:14), forward primer from ex8 (AF: 
CCCTTTGGAGAGGATGATGA) (SEQ.ID.NO.:15) was paired with a 
reverse primer from ex10 (CR: CTCTGGCATATCCGTCAGGT) 
(SEQ.ID.NO.:16), forward primer from ex10 (CF: 
CTTCAAGTCTGCCCCACTGT) (SEQ.ID.NO.:17) was paired with a 
reverse primer from ex11 (DR: GCATCCCCATTAGGAAGCAG) 
(SEQ.ID.NO.:18). 
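
As a quick sanity check on the primer pairs listed above, basic properties such as length, GC content and a rough melting temperature can be tabulated. The sketch below is illustrative only: the helper names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is a common rule of thumb that the text does not say was used in designing these primers.

```python
# Primer sequences as listed in the text; helper functions are illustrative only.
PRIMERS = {
    "LF": "CTAGTCGCCAGACCTTCTGTG",
    "GR": "CTTGTAGACTGCGGTGCTGA",
    "GF": "GAAAGCAAGGACGAGCAAAG",
    "BR": "AATCCAGTCGTAGGCATACAGG",
    "EF": "ACCTTGCGTACTCAGTGTGGA",
    "AR": "TGTCGACAATCCAGTTGGTCT",
    "AF": "CCCTTTGGAGAGGATGATGA",
    "CR": "CTCTGGCATATCCGTCAGGT",
    "CF": "CTTCAAGTCTGCCCCACTGT",
    "DR": "GCATCCCCATTAGGAAGCAG",
}

# Forward/reverse pairings described in the text.
PAIRS = [("LF", "GR"), ("GF", "BR"), ("EF", "AR"), ("AF", "CR"), ("CF", "DR")]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature: 2 per A/T plus 4 per G/C (degrees C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for fwd, rev in PAIRS:
    for name in (fwd, rev):
        seq = PRIMERS[name]
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Tm ~{wallace_tm(seq)} C")
```

The reverse-complement helper is included because a reverse primer appears in the sense-strand sequence as its reverse complement, which is useful when locating a primer within a cDNA or genomic sequence.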

A 50 µl PCR reaction was performed using the Taq Gold 
DNA polymerase (Perkin Elmer, Norwalk, CT) in the reaction buffer 
supplied by the manufacturer with the addition of dNTPs, primers, and 
approximately 0.5 ng of human retina cDNA. PCR products were 
electrophoresed on a 2% agarose gel and DNA bands were excised, 
purified and subjected to sequence analysis with the same primers that 
were used for PCR amplification. The assembly of the DNA sequence 
results of these PCR products revealed that: 

(i) exons 1 and 2 from one side and exons 4, 5, 6, 7, 8, 9, 
10, and 11 indeed represent fragments of the same gene 

(ii) an additional exon is present between exons 2 and 4 
(named ex3) 

(iii) exon 7 (Hypothetical) predicted by the BLASTX 
analysis is present in the CG1CE cDNA fragment amplified by EF/AR 
primers. 

Comparison of the DNA sequences obtained from RT-PCR 
fragments with genomic sequences obtained from pCA759J12-2, PAC 
466A11, and shotgun PAC 759J12 subclones was performed using the 
programs Crossmatch and AceDB. This analysis confirmed the 
presence of the exons originally found in five ESTs (AA318352, 
AA307119, AA205892, AA317489, and AA326727) and identified an 
additional exon (exon 3) in the CG1CE gene. The exact sequences of 
exon/intron boundaries within the CG1CE gene were determined for all 
of the exons. The splice signals in all introns conform to published 
consensus sequences. The CG1CE gene appears to span at least 16 kb of 
genomic sequence. It contains a total of 11 exons. 

Two splice donor sites for intron 7 

Two splicing variants of exon 7 were detected upon 
sequence analysis of RT-PCR products amplified from human retina 
cDNA with the primer pair EF/AR. Two variants utilize alternative 
splice donor sites separated from each other by 203 bp. Both splicing 
sites conform to the published consensus sequence. 

Identification of 5' and 3' ends of CG1CE cDNA 

RACE is an established protocol for the analysis of cDNA 
ends. This procedure was performed using the Marathon RACE 
template from human retina, purchased from Clontech (Palo Alto, CA). 
cDNA primers KR (CTAAGCGGGCATTAGCCACT) (SEQ.ID.NO.:19) 
and LR (TGGGGTTCCAGGTGGGTCCGAT) (SEQ.ID.NO.:20) in 
combination with a cDNA adaptor primer AP1 
(CCATCCTAATACGACTCACTATAGGGC) (SEQ.ID.NO.:21) were 
used in 5'RACE. cDNA primer DF 
(GGATGAAGCACATTCCTAACCTGCTTC) (SEQ.ID.NO.:22) in 
combination with a cDNA adaptor primer AP1 
(CCATCCTAATACGACTCACTATAGGGC) (SEQ.ID.NO.:21) was used 
in 3'RACE. Products obtained from these PCR amplifications were 
analyzed on 2% agarose gels. Excised fragments from the gels were 
purified using Qiagen QIAquick spin columns and sequenced using 
ABI dye-terminator sequencing kits. The products were analyzed on 
ABI 377 sequencers according to standard protocols. 
EXAMPLE 2 

Best's macular dystrophy is associated with mutations in an 
evolutionarily conserved region of CG1CE 

Genomic DNA from BMD patients from two Swedish 
pedigrees having Best's macular dystrophy (families S1 and SL76) was 
amplified by PCR using the following primer pair: 
exG_left AAAGCTGGAGGAGCCGAG (SEQ.ID.NO.:23) 
exG_right CTCCACCCATCTTCCGTTC (SEQ.ID.NO.:24) 
This primer pair amplifies a genomic fragment that is 412 bp long and 
contains exon 4 and adjacent intronic regions. 
The patients were: 

Family S1: 

S1-3, a normal individual, i.e., not having BMD; sister of S1-4 
S1-4, an individual heterozygous for BMD; and 
S1-5, an individual homozygous for BMD. 

Patients S1-4 and S1-5 had the clinical symptoms of BMD, including 
morphological changes observable upon ophthalmologic examination. 

Family SL76: 

SL76-3, an individual heterozygous for BMD; mother of SL76-2 
SL76-2, an individual heterozygous for BMD; son of SL76-3. 

PCR products produced using the primer sets mentioned 
above were amplified in 50 µl reactions consisting of Perkin-Elmer 10 x 
PCR Buffer, 200 mM dNTP's, 0.5 µl of Taq Gold (Perkin-Elmer Corp., 
Foster City, CA), 50 ng of patient DNA and 0.2 µM of forward and 
reverse primers. Cycling conditions were as follows: 

1. 94°C 10 min 

2. 94°C 30 sec 

3. 72°C 2 min (decrease this temperature by 1.1°C per cycle) 

4. 72°C 2 min 

30 5. Go to step 2 15 more times 

6. 94°C 30 sec 

7. 55°C 2 min 

8. 72°C 2 min 

9. Go to step 6 24 more times 

10. 72°C 7 min 

11. 4°C 

Products obtained from this PCR amplification were 
analyzed on 2% agarose gels and excised fragments from the gels were 
purified using Qiagen QIAquick spin columns and sequenced using 
ABI dye-terminator sequencing kits. The products were analyzed on 
ABI 377 sequencers according to standard protocols. 
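
The cycling conditions listed above describe a touchdown program: the annealing temperature starts at 72°C and drops by 1.1°C per cycle before settling at 55°C. The sketch below expands that list into an explicit schedule. It is illustrative only; the function names are hypothetical, and it assumes the first touchdown cycle anneals at 72°C with the decrement applied from the second cycle onward. The final 4°C hold is omitted.

```python
# Illustrative expansion of the listed cycling steps; names are hypothetical.

def touchdown_temps(start_c=72.0, step_c=1.1, cycles=16):
    """Annealing temperature for each touchdown cycle.

    Step 5 repeats steps 2-4 "15 more times", giving 16 cycles in total.
    Assumes the first cycle anneals at start_c.
    """
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


def cycling_program(touchdown_cycles=16, fixed_cycles=25, fixed_anneal_c=55.0):
    """Return the program as (label, temperature in C, seconds) tuples."""
    program = [("initial denaturation", 94.0, 10 * 60)]
    for temp in touchdown_temps(cycles=touchdown_cycles):
        program += [("denature", 94.0, 30), ("anneal", temp, 120), ("extend", 72.0, 120)]
    # Step 9 repeats steps 6-8 "24 more times", giving 25 fixed-temperature cycles.
    for _ in range(fixed_cycles):
        program += [("denature", 94.0, 30), ("anneal", fixed_anneal_c, 120), ("extend", 72.0, 120)]
    program.append(("final extension", 72.0, 7 * 60))
    return program


temps = touchdown_temps()
print("touchdown annealing temperatures:", temps)
print("total program steps:", len(cycling_program()))
```

Under these assumptions the last touchdown cycle anneals at 55.5°C, just above the fixed 55°C used for the remaining 25 cycles, for 41 cycles in total.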

The results are shown in Figure 6. Figure 6 shows a 
chromatogram from sequencing runs on the PCR fragments from 
patients S1-3, S1-4, and S1-5. The six readings represent sequencing of 
both strands of the PCR fragments from the patients. As can be seen 
from Figure 6, the two patients affected with BMD, patients S1-4 and 
S1-5, both carry a mutation at position 383 of SEQ.ID.NO.:2. Both copies of 
the CG1CE gene are mutated in homozygous affected S1-5, while 
heterozygous affected S1-4 contains both normal and mutated copies of 
the CG1CE gene. This mutation changes the codon that encodes the 
amino acid at position 93 of SEQ.ID.NO.:3 from TGG (encoding 
tryptophan) to TGC (encoding cysteine). Patient S1-3, a normal 
individual, has the wild-type sequence, TGG, at this codon. This disease 
mutation that changes this TGG codon to a TGC codon was not found 
upon sequencing of 50 normal unrelated individuals (100 chromosomes) 
of North American descent. 

Both patients from family SL76 carry a mutation at position 
357 of SEQ.ID.NO.:2. This mutation changes the codon that encodes the 
amino acid at position 85 of SEQ.ID.NO.:3 from TAC (encoding tyrosine) 
to CAC (encoding histidine). This disease mutation that changes this 
TAC codon to a CAC codon was not found upon sequencing of 50 normal 
unrelated individuals (100 chromosomes) of North American descent. 
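
The relationship between the nucleotide positions and the amino acid positions quoted above can be checked with a short calculation. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it assumes the coding sequence begins at position 105 of SEQ.ID.NO.:2, which is inferred from the range "positions 105-1,859 of SEQ.ID.NO.:2" recited in the claims. Only the four codons named in the text are included in the lookup table.

```python
# Assumption: coding sequence starts at nucleotide 105 of SEQ.ID.NO.:2
# (inferred from the range 105-1,859 recited in the claims).
ORF_START = 105

# Only the codons mentioned in the text; not a full genetic code table.
CODON_TO_AA = {"TGG": "Trp", "TGC": "Cys", "TAC": "Tyr", "CAC": "His"}


def locate(nt_position, orf_start=ORF_START):
    """Return (codon number, position within codon), both 1-based."""
    offset = nt_position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the assumed coding start")
    return offset // 3 + 1, offset % 3 + 1


def substitute(codon, position_in_codon, new_base):
    """Replace one base (1-based position) of a codon."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


def describe(nt_position, wild_type_codon, new_base):
    """Name the amino acid change caused by a single-base substitution."""
    codon_no, pos = locate(nt_position)
    mutant = substitute(wild_type_codon, pos, new_base)
    return f"{CODON_TO_AA[wild_type_codon]}{codon_no}{CODON_TO_AA[mutant]}"


print(describe(383, "TGG", "C"))  # family S1 mutation
print(describe(357, "TAC", "C"))  # family SL76 mutation
```

With a coding start at 105, nucleotide 383 is the third base of codon 93 and nucleotide 357 is the first base of codon 85, which agrees with the amino acid positions given in the text.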

Amino acid positions 85 and 93 of the CG1CE protein are 
evolutionarily conserved. Figure 7 demonstrates that position 93 is 
occupied by tryptophan not only in the CG1CE protein but also in 15 of 
16 related C. elegans proteins. The lone C. elegans protein in which this 
residue is not tryptophan contains an isofunctional phenylalanine 
instead. Phenylalanine and tryptophan, both being hydrophobic 
aromatic amino acids, are highly similar. Position 85 is occupied by 
tyrosine or isofunctional phenylalanine in all 16 related C. elegans 
proteins. Phenylalanine and tyrosine, both being aromatic amino acids, 
are highly similar. 



EXAMPLE 3 



RT-PCR: RT-PCR experiments were performed on 
"quick-clone" human cDNA samples available from Clontech, Palo Alto, CA. 
cDNA samples from heart, brain, placenta, lung, liver, skeletal muscle, 
kidney, pancreas, and retina were amplified with primers AF 
(CCCTTTGGAGAGGATGATGA) (SEQ.ID.NO.:15) and CR 
(CTCTGGCATATCCGTCAGGT) (SEQ.ID.NO.:16) in the following PCR 
conditions: 

1. 94°C 10 min 

2. 94°C 30 sec 

3. 72°C 2 min (decrease this temperature by 1.1°C per cycle) 

4. 72°C 2 min 

5. Go to step 2 15 more times 

6. 94°C 30 sec 

7. 55°C 2 min 

8. 72°C 2 min 

9. Go to step 6 19 more times 

10. 72°C 7 min 

11. 4°C 

The CG1CE gene was found to be predominantly expressed in human 
retina and brain. 

Northern blot analysis: Northern blots containing poly(A+)-RNA 
from different human tissues were purchased from Clontech, Palo 
Alto, CA. Blot #1 contained human heart, brain, placenta, lung, liver, 
skeletal muscle, kidney, and pancreas poly(A+)-RNA. Blot #2 contained 
stomach, thyroid, spinal cord, lymph node, trachea, adrenal gland, and 
bone marrow poly(A+)-RNA. 
Primers CF (CTTCAAGTCTGCCCCACTGT) (SEQ.ID.NO.:17) and 
exC_right (TAGGCTCAGAGCAAGGGAAG) (SEQ.ID.NO.:25) were 
used to amplify a PCR product from total genomic DNA. This product 
was purified on an agarose gel, and used as a probe in Northern blot 
hybridization. The probe was labeled by random priming with the 
Amersham Rediprime kit (Arlington Heights, IL) in the presence of 
50-100 µCi of 3000 Ci/mmole [alpha-32P]dCTP (Dupont/NEN, Boston, MA). 
Unincorporated nucleotides were removed with a ProbeQuant G-50 spin 
column (Pharmacia Biotech, Piscataway, NJ). The radiolabeled probe 
at a concentration of greater than 1 x 10^6 cpm/ml in rapid hybridization 
buffer (Clontech, Palo Alto, CA) was incubated overnight at 65°C. The 
blots were washed by two 15 min incubations in 2X SSC, 0.1% SDS 
(prepared from 20X SSC and 20% SDS stock solutions, Fisher, 
Pittsburgh, PA) at room temperature, followed by two 15 min 
incubations in 1X SSC, 0.1% SDS at room temperature, and two 30 min 
incubations in 0.1X SSC, 0.1% SDS at 60°C. Autoradiography of the blots 
was done to visualize the bands that specifically hybridized to the 
radiolabeled probe. 

The probe hybridized to an mRNA transcript that is 
uniquely expressed in brain and spinal cord. 

A mouse probe for the murine ortholog of the CG1CE gene 
was generated based on the sequence of an EST with GenBank accession 
number AA497726. The 246 bp probe was amplified from mouse heart 
cDNA (Clontech, Palo Alto, CA) using the primers mouseCG1CE.L 
(ACACAACACATTCTGGGTGC) (SEQ.ID.NO.:26) and 
mouseCG1CE.R (TTCAGAAACTGCTTCCCGAT) (SEQ.ID.NO.:27). 
Due to an extremely low expression level of the CG1CE gene in mouse 
heart, repetitive amplification steps were used to generate this probe. 
The authenticity of this probe was verified by sequence analysis of the 
gel-purified DNA band. A Northern blot containing poly(A+)-RNA from 
several rat tissues (heart, brain, spleen, lung, liver, skeletal muscle, 
kidney, testis) was purchased from Clontech, Palo Alto, CA. The probe 
hybridized to an mRNA transcript that is expressed in testis only. 

The present invention is not to be limited in scope by the 
specific embodiments described herein. Indeed, various modifications 
of the invention in addition to those described herein will become 
apparent to those skilled in the art from the foregoing description. Such 
modifications are intended to fall within the scope of the appended 
claims. 

WO 99/43695 PCT/US99/03790 

Various publications are cited herein, the disclosures of 
which are incorporated by reference in their entireties. 
WHAT IS CLAIMED: 

1. An isolated DNA comprising nucleotides encoding a 
polypeptide having an amino acid sequence selected from the group 
consisting of SEQ.ID.NO.:3, SEQ.ID.NO.:5, and SEQ.ID.NO.:29. 

2. The DNA of claim 1 comprising a nucleotide 
sequence selected from the group consisting of: SEQ.ID.NO.:1, 
SEQ.ID.NO.:2, SEQ.ID.NO.:4, SEQ.ID.NO.:28, positions 105-1,859 of 
SEQ.ID.NO.:2, positions 105-1,409 of SEQ.ID.NO.:4, and positions 
11-1,663 of SEQ.ID.NO.:28. 

3. An isolated DNA comprising a sequence that is 
identical to SEQ.ID.NO.:2 except that it contains a different nucleotide 
at a position selected from the group consisting of positions 120, 121, 122, 
357, 358, 359, 381, 382, 383, 783, 784, 785, 999, 1000, and 1001. 

4. An isolated DNA that hybridizes under stringent 
conditions to a nucleotide sequence selected from the group consisting 
of: SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:4, and SEQ.ID.NO.:28. 

5. An expression vector comprising the DNA of 
claim 1. 

6. A recombinant host cell comprising the DNA of 
claim 1. 

7. A CG1CE protein, substantially free from other 
proteins, having an amino acid sequence selected from the group 
consisting of SEQ.ID.NO.:3, SEQ.ID.NO.:5, and SEQ.ID.NO.:29. 

8. The CG1CE protein of claim 8 containing a single 
amino acid substitution. 

9. The CG1CE protein of claim 9 where the substitution 
occurs at position 6, 85, 93, 227, or 299. 

10. The CG1CE protein of claim 9 where the substitution 
is a conservative substitution. 

11. The CG1CE protein of claim 8 containing two amino 
acid substitutions. 

12. The CG1CE protein of claim 8 containing an amino 
acid substitution where the substitution does not occur in a position 
where the amino acid present in CG1CE is also present in the 
corresponding position in one of the C. elegans proteins whose partial 
amino acid sequence is shown in Figure 7. 

13. An antibody that binds specifically to a CG1CE 
protein where the CG1CE protein has the amino acid sequence selected 
from the group consisting of SEQ.ID.NO.:3 and SEQ.ID.NO.:5. 

14. A method of diagnosing whether a patient carries a 
mutation in the CG1CE gene that comprises: 

(a) providing a DNA sample from the patient; 

(b) providing a set of PCR primers based upon 
SEQ.ID.NO.:2 or SEQ.ID.NO.:4; 

(c) performing PCR on the DNA sample to produce a 
PCR fragment from the patient; 

(d) determining the nucleotide sequence of the PCR 
fragment from the patient; 

(e) comparing the nucleotide sequence of the PCR 
fragment from the patient with the nucleotide sequence of SEQ.ID.NO.:2 
or SEQ.ID.NO.:4; 

where a difference between the nucleotide sequence of the 
PCR fragment from the patient with the nucleotide sequence of 
SEQ.ID.NO.:2 or SEQ.ID.NO.:4 indicates that the patient carries a 
mutation in the CG1CE gene. 

15. The method of claim 15 where the DNA sample is 
genomic DNA. 

16. The method of claim 15 where the DNA sample is 
cDNA. 

17. A DNA or RNA oligonucleotide probe comprising at 
least 18 contiguous nucleotides of at least one of a sequence selected from 
the group consisting of: SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:4, 
and SEQ.ID.NO.:28. 

18. A method for determining whether a substance is an 
activator or an inhibitor of a CG1CE protein or a mutant CG1CE protein 
comprising: 

(a) recombinantly expressing CG1CE protein or mutant 
CG1CE protein in a host cell; 

(b) measuring the biological activity of CG1CE protein or 
mutant CG1CE protein in the presence and in the absence of a substance 
suspected of being an activator or an inhibitor of CG1CE protein or 
mutant CG1CE protein; 

where a change in the biological activity of the CG1CE 
protein or the mutant CG1CE protein in the presence as compared to the 
absence of the substance indicates that the substance is an activator or 
an inhibitor of CG1CE protein or mutant CG1CE protein. 
FIGURE 1A 



1 ccaaaaaatt gttctcttgg gggttggggc gacaagcggg aagggagggc 
51 attttgggca aattggctta ttgccacgca agggctttaa caccttaggt 
101 tggtgggttc acaggttgca ggcaacccac catggcacac gtatacctat 
151 gtaaccaacc tgcaccatca tgtataccta tgtaaccaac ctggtacatt 
201 ctgcacacgt atcccaggac tttagagtga aaaaaaaagt ggtgtgtaga 
251 aaaatcacct gcaatctcag catagttaac gcttagtaca tttcagagag 
301 agagggtgac aggaaaggga ggatgagagt gggtttaaga cacaaggtca 
351 tattataaaa tcagggcttc tggaagttta gtcccaaaac cacacatctc 
401 ataatcccct gcagtgcttg attaaaatgc aacatcccta aggccacaga 
451 ctcagactct ggagaaagat ccagaaaact gcccgtttaa taaacatttg 
501 ggcgattctt acggcctcta aagaccaaga accactgctg cctagagctc 
551 tgctctcttc attgaacaat acaagaggag tgtgtaggta gacacccacc 
601 acttccaaca gcttaggaga gcccttgagt atggattgat gtattaaaat 
651 ttattgaatc acatgctgag attttcacca gctgcccgtg gggatctggg 
701 catttattcc catattgcac tggctggctg gaagccagca gcataaacrc 
751 cagggctgtt ctgtcaaccc ccaccagact cacccccctc caccagcccc 
801 ggcaggcttc tccttccatc tctctgaagc aacttactga tgggccctgc 
851 cagccaatca cagccagaat aacgtatgat gtcaccagca gccaatcaga 
901 gctcctcgtc agcatatgca gaattctgtc attttactag ggtgatgaaa 
951 ttcccaagca acaccatcct tttcagataa gggcactgag gctgagagag 
1001 gagctgaaac ctacccgggg tcaccacaca caggtggcaa ggctgggacc 
1051 agaaaccagg actgttgact gcagcccggt attcattctt tccatagccc 
1101 acagggctgt caaagacccc agggcctagt cagaggctcc tccttcctgg 
1151 agagttcctg gcacagaagt tgaagctcag cacagccccc taacccccaa 
1201 ctctctctgc aaggcctcag gggtcagaac actggtggag cagatccttt 
1251 agcctctgga ttttagggcc atggtagagg gggtgttgcc ctaaattcca 
1301 gccctggtct cagcccaaca ccctccaaga agaaattaga ggggccatgg 
1351 ccaggctgtg ctagccgttg cttctgagca gattacaaga agggactaag 
1401 acaaggactc ctttgtggag gtcctggctt agggagtcaa gtgacggcgg 
1451 ctcagcactc acgtgggcag tgccagcctc taagagtggg caggggcact 
1501 ggccacagag tccCAGGGAG rrgrftrr^ r cTAGTecerA GACCTTrrer 
1551 SSGATCATCO SAUTftrfTg gaaccccacc tgtgagtaca aggtgcccca 
1601 ggtggactgg gctggggctt tgaggccttc agggttggat ggccatcttg 
1651 cgtatttgtg tgggatatgc acacacaggc agcacatgcg caggtgtgtg 
1701 ggcacctgtg tgtctgtgca aatgccctga ggtgggaatg agcttggtgt 
1751 gcatcaggag cgacagccag ccagtgtggc tgcagcaaaa cacacaggga 
1801 aagaatggag ggggcatcaa tcactgacaa aattatttat agagctcccc 
1851 ctaaaaaaaa gaaggtctct tctttcgata gaagaaggga gagagggggt 
1901 ttgtccttat aaatataagg gaggagccgc ccctcaaaaa ataagggagg 
1951 gaggacccaa gaccccgtgg gttgtgtgtt ttccaggggg agctcgaacc 
2001 ctttagaggg agcgtgggag aaccgctgta ttcaggcctc tcgagagaaa 
2051 aggagcggcc gcccaaaaaa tatccctccc gggcgataag aaatggtggc 
2101 ctctctcaaa aagatgaaga ggaagccgga gttgtatgtg ttgatatttt 
2151 taaaactcca ggtagnnnnn nnnnntgctt cagtaaattt ttattgagcg 
2201 ccttctacga gaacacaaga ggagcttcca ttctgaggag gaaacaggca 
2251 ggaaacaggc agatatcctg tataatttca agtagtgata agtgctctct 
2301 agaaatatca agcaaggtga ggagacacag agcaccggtg gcagtggggc 
2351 tctatttcca ggttggatgg ttgggaacat cctttctaaa gggaacctgg 
2401 agtgggaagg aaccatgcag gtatctcagg aagagcttcc tccaggcagg 
2451 aagatcagca ggtggaaagg ccctggagcc accattcagt aaacatcatt 
2501 tgagcatctc taccagctag gttccattat gggaatggga atatggtggt 
2551 ggacagggct gcctggtccc ttccatactt ctcacactag ggtggttgag 
2601 agagcttggg agctaacgaa caagatgggc tgagaacact gcctagccca 
2651 gaggacctga gcttagtgtg tagacattgc tgctgttact gcctttgtcg 
2701 ttgtattatt tatttattta tttattgatc ttaagacaga gttttgctct 
FIGURE 1B 
tcttacccag 
cacctcctgg 
gattacaggc 
gagacagggt 
aggtgatcca 
gccactgcgc 
cgcctatcta 
tttcttgcgt 
agaccacttc 
agcagggaag 
acccccaatc 

CCCACTGCCT 



gcttgagtgc 
gatcaagcga 
acccgcacca 
ttcaccatgt 
cctgcctcga 
ccagtgatta 
cgtcttccct 
ttctacttcc 
atccacctcc 
ggtcctgaca 
ggtgtccctc 



aatggcgtga 
ttctcctgcc 
cgcctggata 
tggccaggct 
cttcccaaag 
tagaaagtta 
gccaaagcaa 
aaaaggcagt 
tagggtccct 
ggctctgacc 
tctaccaoGA 
ATC&PTT&rfi 



tctcagctca 
tcagcctcct 
atttttttgt 
ggtctcgaac 
tgctgggatt 
aaggcacatg 
agggcagcct 
cagaactggc 
atgggagagt 
agggcctctg 
CCCAAGCCCA 



ctgcaacctc 
gagtagctgg 
atttttagta 
tcctgacctt 
ataggcatga 
gcaatgcaca 
ctgggctcac 
agggccttgg 
tgaggtccag 
atccctacaa 



CAAGCCAAGT GGrT&ETHrr 



CSCTTMffiTT CCTTCTcrrn ccrcrrer^. TC r^rr^. GCAcr&rrT* 
CAAGCTGCTA TATrerr^pT 



TCCGCTTTAT 
aaggatgtgg 
agctcagggc 
accagctcct 
gtaaagaagt 
gccacccttc 
ggccccactc 
ccacatagta 
gagtctcact 
tccccctgcc 
gcacttgacc 
cagggtttca 
ctcactgtaa 
tcctgagtag 
tttttttttc 
gctggtcttg 
aagtgctggg 
ttttttaaaa 
tgactcgcgt 
acttgagcct 
ccaaaaattt 
agctacttgg 
ggctgcagtg 
ctatctcaaa 
cacagaatat 
ggattgtaaa 
gcaacaaggc 
agttagggcc 
ggccaaggca 
atagggagat 
ccctttggct 
ggatcatgag 
aatctctact 
tagttccagc 
aggcagaggc 
tgacagagtg 
tgggcatggt 
aagcgggagg 
actgtgccgc 
aacaaacaaa 
ttatctaaac 



2Z&X&£gtaa 
ctggggctgg 
ccagtgcacc 
gggcactgga 
cacactgaga 
ctccaacccc 
tactggcctg 
cattaaaaaa 
gtgttgtcca 
ttagcctccc 
aaccacatgg 
ctccatcacc 
cctctgcctc 
ctggaattat 
tgtattttta 
aacccctgac 
attacaggtg 
ttatttttta 
ctgtaatccc 

gggagttcag 

aaaaaattag 
gaagctgagg 
agctatgatc 
agcaaacaaa 
atgatagcat 
atatcaaata 
acatttggtt 
agccacaggg 

ggaggatcac 

cctgatcttg 
tacacccgta 
gtcaggagtt 
ataaatacaa 
tactcaggag 
tgcagtgagc 
agactccgtc 
ggcttatgcc 
attgcttcag 
tgcccttgag 
caaacaaaca 
aataaaataa 



TCCTAATTTT rc-mcrr^r TArT&r&rr* 



agctggcagg 
gagctgggag 
agtccactac 
gctgaggctg 
ggctgctcaa 
aggaggaccc 

ttttactgaa 
gagagagaga 
ggctggtctc 
aaggggctgg 
tacttttttt 
caggctggag 
ccaggtgcaa 
aggcacacac 
gtagagacag 
ctcaagtgat 
tcagccacca 
attaaaatgt 
agcactttga 
cgtgggcaac 
ctgggagtgg 
tgtggggatg 
acaccactgc 
ataatgttta 
tttaaattga 
catgaaattc 
tttactaggg 
gctcacacct 
ttgagcccag 
tctctataaa 
atcccagcac 
caagaccagc 
aaattagccg 
gatgaggccg 
cgagaccatg 
ttaaaataat 
tgtagtccca 
cccaggaggt 
cctgggtaac 
aacaaacaaa 
aggacagata 



gctgggccgg 
ctcctggggg 
aacactaagc 
cgcgctgggg 
gccaggccag 
ctggagccca 
tcccacacag 
gagagagaga 
gaactcctag 
gattacaggt 
tttttttttt 
tgcagtgggg 
gcgattctcc 
caccacgcct 
ggtttcatca 
ccacccacct 
tgcacagccc 
ttatctaagg 
ggggccaagg 
atagtgagac 
tggcatttgc 
gctgaagcct 
acttcagcct 
tctaaacggt 
aaaagcatta 
ttgtgttctt 
caccaaggta 
gtaatcccag 
gagtttagga 
aaattaaaaa 
tttgggaggc 
ctggccaaca 
agtggggtgg 
gagaatcgct 
ccattgcact 
attaaaatct 
cccagctctt 
tgaggctgca 
agagcaagac 
aaccaataaa 
taatcaccga 



ggggcctggg 
cctcccagcc 
tgggctcctg 
gctgggcaga 
cagggtttta 
ggctttgtct 
actcataggc 
gagagagatg 
gctcaagcaa 
gtgagctact 
ttttttgaga 
gcaatcttgg 
tgccttagcc 
ggctaatttt 
tgttggccag 
cggcctccca 
acatggtaca 
ccagtagcag 
tgcggggatc 
cccgtctcta 
ctgtggtccc 
gtgaggtcga 
gagtgacagg 
aaggtataat 
atgattacat 
aataatgcta 
ctttaaaaaa 
cactttggga 
cctgagcaac 
attggctagg 
cgaggcgggt 
tagtgaaccc 
cacgcacctg 
tgagcccggg 
ccagcctagg 
taaaatgatc 
caggaggctg 
gtgagtcatg 
cctatctcaa 
ccaaaaacat 
atatatgata 



FIGURE 1C 
gcattttaaa 
aaatacataa 
ggtctttggt 
attttttaga 
ttctcatgtg 
aacaccccca 
tgacacatca 
tgggcagtac 
atcacagcat 
tattcacccc 
agagacagtg 
agcccattgc 
cctccagtgg 
tttgtagaga 
ggtcctgcct 
tgcccgtccc 
ggatgcattc 
cttccttcct 
gaaaccactg 
agcatcatgg 
atgtgggata 
cctggggaca 
ccacccccac 

GAGAAAPTf^ 



ttgaaaaagc 
aattcttaag 
ttttacttgg 
gtagttttag 
tctctttgct 
cactacagtg 
ctatcaccca 
attccatggg 
caggcagagt 
tctcattaaa 
tctcgctctg 
agcctccaac 
ctacgactgc 
tagggtcttg 
tagcctccca 
aaacactctg 
aaaggatcag 
acacatctcc 
gagggggcct 
acctggctca 
gcatcgaggc 
gtctcagcca 
cccca aGCTG 

CTCTGTATTY* 



actaatgact 
ttcctcctaa 



CTTCGTCtPTTt £gtgagttcc 



gcaccaatgc 
gttcacagca 
cctccccctg 
gtagatttat 
aagttcatag 
tttggataaa 
agtttcactg 
gccaaacact 
tcaaccaggc 
tcctgggctc 
aggcatacgg 
ctatgttgcc 
gagctctggg 
tttcgacctg 
ggtgtctgaa 
cagtggccag 
cctcctgtcc 
ggcctcagga 
agtcccactc 
tctcctcgct 
GCCCTCirnn 



acaatggatt 

taccaaatac 

atgctgaaaa 

aaattgagca 

cccccagcct 

tacaatccct 

cgtacagcag 

tgtgtaatga 

ctctaacaaa 

ctgtttcctt 

tgaagtgcaa 

aagtgatcct 

caacggcacc 

caggctggtc 

attacaggcg 

cttttaaaca 

actggcctct 

tgtgaggatt 

gggtttgggg 

ggggccctgg 

ctacccaggg 
gcgtccacac 



ataaaacatc 
aaagcacatt 
agagtcgttc 
gaaggtagag 
ccccactatc 
gaacccacag 
ggttcactct 
tgtctccacc 
atcctctgcc 
ttttcctttt 
tggcaatcac 
cctatctcag 
caactaattt 
ttgaactctt 
tgaaccaccg 
actgaccctt 
gcagcaggac 
ctccccacaa 
ctgtacaagg 
gctggggaaa 
ccgggctaga 
aattccaccc 
GCTGATGTTT 



gcccaggctc 
ggctggggag 
agggaaaggt 
agacgtcctg 
acatcatggt 
tttctgggac 
cgcgcctcca 
ggagggctgg 
gcatcgccgg 
ccctcgcccc 
ACCCGCTGGT 



cagacaggcc 
ggggcggggg 
gcggactgca 
ccgttagcaa 
ccctggagcc 
cagcaggggg 
tgcgaggctc 
gggctaggcc 
gcgctgggcc 
ccgcccctcc 

GGAACCAGTft 



cccttctggc 
aggggaggat 
aacgccagcg 
gccagagaaa 
tgaaaacccc 
cctgcgcggg 
acccccgggt 
tgcctgcctc 
cgctcgcagc 
ctgggctctg 
tgcccag££J 



ATCCAGCTPA TccmTrrr 



tgttccgggt 
cacgaggagc 
gcaggtcggc 
ctgaagttag 
attttctgag 
aggggagggg 
gacagaaccc 
tcgctcccga 
agaaagctgg 
gccgcagcct 
TCTArnTr.&r 



ccctgtggcc 
tgcggcaagg 
gcctctctgt 
acgttaggta 
ggaagcgctg 
gtctggcgga 
ttggggctct 
gcgccttcca 
aggagccgag 
ggcccctcgc 
GCTSSTCfiTff 



.CCGTGGCCCG ACCGrrTTAT 



gCWCTOCTg TCg^CTTTO TTnAAGGCAA r-r.irv^ CCAA mrrfMrnvar 

TCCTgCTCAC gCTCATrror TAttrr^nr Tannr^nr^ gctcatcc™ 
CgCAGCGTTft GCACCGrAnT cr*r**r.nr.n rrrrr nnrn crrar^rrr 



GGTGCAAGPA 
ccaggggccg 
tcccccggac 
atttgggggt 
gaggccagga 
acccttgagg 
tactagggtc 
actctggagg 
acctggtcct 
cctgtaatcc 
cagctgtttg 
aaaacattaa 
ctgagtatcg 
aggctgcagt 
agccagaccc 
cccagaacag 



iSgtgggcgga 
agatgggcgc 
tcgggggact 
ccaattgggc 
gcccaccctc 
gataatggaa 
tacttccctc 
tatgggacat 
ggttaataag 
cagtgcttta 
agacgcccct 
aaattagcag 
ggaggctgag 
gcgctaagat 
tttctctgga 
cacctagtag 



ccgggagcaa 

ggcaggaacg 

gggtggagcc 

gggacagagt 

cgagagtagg 

agaagggtga 

tgcccttgcc 

tggtctctga 

acagacccag 

ggaggcaaag 

gagcaacata 

ggcatggtgg 

gcaggaggat 

cgcaccgctg 

aataaataaa 

gtgctcagaa 



cggggaggca 
gaagatgggt 
aggagtgggg 

<=gggtgtctg 

agtctgaggc 
cggcttggga 
cctcttgatc 
caccccctca 
gctaggcgtg 
gtgggaagat 
gcgagacccc 
cgtgtgcctg 
cacttgagcc 
cactccaacc 
taccctgccc 
atttttttgt 



ccgggcagag 

ggagccaaag 

tgtggtcaag 

aaggtggggc 

agggctaagg 

actggtgagg 

tccggtttcc 

gcctggcctg 

gtggctctcg 

cgcttgagcc 

catctctaca 

tagtctgagg 

cagcagttcc 

tcggtgacag 

acatgctcag 

tgttgaaaga 
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«It™? ff ! aaaSgagt9 ct ^ggttcc tataggtcag caggtgccgg 
M ™ tgcaggttct cccacccacc gccttcttca ctccactctg 
GACTCCTOGPft flMrininr »«™~»nnn ^"'Tii 
CCACACAftrft TGTTCTTtGGT nrrrrr^ Trn^ J l 

GAAGGCOTGG CTTT^nn^ n n ^ crJz* r^TT ^ 



TGCTGAACar 
gttgtggtcc 
ctgagggtct 
gcaatccaca 
caagtctgtg 
caaagaggtt 
aggtaccagg 

ggggcaggtg 



gagcccactg 
Acaggaaaca 
tccgagagcc 
gcccgaggtg 
aggtcctggt 
tagtgagctt 
ccctggtacc 
gtgttcagaa 
gCETACTTftn 



AGTATrrrAc Tr^ T ^ Tfir 



tacagacagg 
aggtttccta 
ggaggtgggg 
gtcccttatc 
tcccttttga 
cccatggcca 
tggagaagag 
ccccatcccc 



gctgccgcag 
caaagagaag 
ttgcagaatc 
agaggcccct 
tagatgagga 
cacagccagg 

gtgggggcga 

ctcttctgcc 



*gtgggaagg 
ccttgggccc 
ttttccaaca 
ccctcttctc 
agctgagaca 
aatggaccat 
gcccagggtg 
cccca aGAGA 



cttttgggaa 

gggctcacct 

gaagttgggt 

gcctggccca 

ggttggccgg 

ctgaggcagg 

acatggtgaa 

gtgcacgcct 

ttgaacccgg 

tccagcctgg 

aacaaacaaa 

agtgtgcaag 

cctggagcat 

cctcctcctc 



actgaggcta 

agaggctaag 

ctggactttg 

gtccagtaga 

gcacagtggc 

tggatcacct 

accccatctc 

gtaatcccag 

gaggtggagg 

gcgacacagc 

caaacaaaca 

tcagaacaag 

cctgatttca 

ctcctcctcc 



ACAGgtgagg 

gaaggaccaa 

tggctcccct 

aagtgccaag 

ggcaatgtga 

tcatgcctgt 

gaggtcagga 

tactgaaaat 

ctacttggga 

ttgcagtgag 

aagactctgt 

aaggggttaa 

gccttggtct 

gggttcccac 

tcccaa GTGG 



TGTATGCCTA cr^r^. m 



actaggctgg 

ggaagcagct 

gggagttggg 

ttctaagagt 

ttatccccat 

aatcccagca 

gttcgagacc 

acagaattag 

ggctgaggca 

ctgagatcat 

ctcaaacaaa 

cagagcccct 

cctgtctcag 

ctagcccttt 

TGACTCZTcirzr 



tgaggctgcc 

ggggtgggaa 

tccacacttt 

ccaggctcct 

attaaagaga 

ctttgggaag 

agcctggcca 

ctgtgtggtg 

ggagaatcgc 

gccactgcac 

caaacaaaca 

aagtcacata 

actcccagcc 

gctaccacat 

GSTCTftrftfrr 



CTACCCTSgc gATGftnrTQG ACCTrn>nn T ^crrn-rr^r h cG>rrcc-vnr 

SS^SSI TrS?^ rVr^g 

ctgwct^* <*XMrwrA GAGaanrr^ ^ C AG ^r ty^^^ 

^rarea** aw***™- n rr y mm ^ affrn i fTfr 

CQT&TGGCG r^cCTGTA ATrrrAarr* 
GAATCGCT7K AACrr^n p^™^ 



actgcactcc 
acaacaacaa 
cagagcgaac 
gtagctgtcc 
atttcttcaa 
tcctaattta 
tacagagaga 
aaaaacttta 
nnnagatgtt 
tccttctgtt 
ttggtaatgg 
acacacacac 
ctctaaattc 
acaaatcaca 
caagatgtcc 
tttgttaggg 
tgccctttag 



agcctgggca 

aacaaagccc 

actctcctat 

agtattctcc 

ctcttaattc 

tgaatgggtt 

gagaaagatc 

ttaaatcagg 

ctgaatcaga 

gcccacccac 

gggtgtaagt 

acacacacac 

cccctgcacc 

cttttatgct 

cctggacccc 

cattttagag 

ttcagcccag 



aaagaatgaa 

taaggttcag 

taagatgctg 

acacagcata 

ctcctttgtg 

agtatgctct 

tatcttaatc 

caagtaaaat 

gagttttctc 

tctctctccc 

ctctgtctct 

acacacacac 

cccagttatc 

tgaaattctc 

taaggcagac 

gttgctatcc 

cttcagtata 



ISgtgagttg 
actctatctc 
aagcccctgc 
ttgggtgtct 
atcgacagat 
ccaccatttt 
gcttctgcat 
ccgccccatt 
ccgccaagga 
tcgagctctt 
ttcctacctt 
gcccttcctg 
acacacacac 
tttggtttct 
cagggtgccc 
gcgtgtcacc 
aggaatctgc 
tatctctgtt 



agatcgtgcc 
aaaaacaaca 
cctttagaag 
ttttcactca 
tctaatacaa 
ttcttctacc 
tgagacaaaa 
ttagttggaa 
ttgnnnnnnn 
tatctttcct 
cctttatttt 
tcactgtgac 
attcctattc 
gcagatcaaa 
cagtggcctg 
tcttcggggc 
ccacctagac 
gcatgaatga 
FIGURE 1E



10901 
10951 
11001 
11051 
11101 
11151 
11201 
11251 
11301 
11351 
11401 
11451 
11501 
11551 
11601 
11651 
11701 
11751 
11801 
11851 
11901 
11951 
12001 
12051 
12101 
12151 
12201
12251 
12301 
12351 
12401 
12451 
12501 
12551 
12601 
12651 
12701 
12751 
12801 
12851 
12901 
12951 
13001 
13051 
13101 
13151 
13201 
13251 
13301 
13351 
13401 
13451 
13501 
13551 



ataaaattat gcaactccag 
gactcagccg agtgatacac 
actggctcag aagagttaga 
acaagtgtgg ggggctggag 
aggcaggaag ggcttcatgg 
agggggaagc tggctttgag 
cctgtcccca a oGTGGCAG^ 

TGATCATTTT GAftftrCAftrT gga^^ 0^^. 



gtaagataca 
tcagggacag 

ggggctgtgt 

ccctaaactc 
ggtgtggaaa 
gagttctgcc 



tgaggtgaga 
ctgtgggtgt 
ccagaagtgt 
tgcctttgaa 
tagcagcagc 
tgagggttta 
AACCCcttty: 



taaaggcagt 
tcagggaagg 
gtgggtgcct 
gacagtggtc 

tgaggtttaa 

cagagcctca 
GAGACGATG& 



gagagggaga 
gagctcctcc 
atctttgagg 
aactgaggtc 
caaggtcctg 
Qaccao GTGT 
GATGGAGrrr. 



gaaaccatac 
ctcctgcagc 
ctgcaggcag 
cagagagagg 
cctgggatga 



catggacctt 
cagtcattca 
gcacccatct 
gagagattcc 
tctttctgtg 



ccccaaagtg 
ctcacaggat 
ccccatttca 
tccaagtcat 
ggacttcttc 



£ASgtatggg 
gacccaaaga 
tctcacctca 
caggcaggga 
caggcacata 
tgtccctggt 
ACCTCPrrrr, 



GACATTtTACT GGAATAAOrr cninmr^ r crrrmr* 



CA<?QTgCTTr CgCCrAGTTr CGTcr.*r.rr T crr^r^^n c?cc*rr<T*rr 



AAC ATC AGgf t 
tgcaggggtc 
ttgcttcagt 
tatttttttc 
attaaagtac 
caaatggtgc 
ggaacgttag 
tttgagacag 
tcttggctca 
tcagcctccc 
aatttttgta 
gtctccaact 
ggaattatag 
tgggaagtgg 
cagcaggcag 
ccgagtaaag 
gtcccacttc 
aagggctatc 
gccttggaga 
ccacggtatc 
aacggagttt 
cggctcactg 
gcctcctgag 
ttttgtattt 
gaactcctga 
gattacatgt 
aatatcctac 
gaggaatggt 
cagaaacatt 
gcagctgaag 
gcatagacct 
ccaagtctca 
agggacagaa 
ccagctgggt 
ggcaagggca 
aggtctgggg 
gtggatgtca 
cactcatggg 



gtggccagag 
tgcctaggaa 
aagtgtcagg 
ctcccaataa 
aggttcagag 
atttgctact 
gacctggctc 
tatctcgctc 
ctgcaacctc 
cagtagctgg 
cttttagtag 
cctgaccagt 
gtgtcaaaac 
aagtggggtt 
ccaggccatc 
ggctcaggcc 
cctgattcca 
ccagctggtc 
gtgttgggca 
cagtgctgtt 
cactcttgtt 
caacctccgc 
tagctgggat 
ttagtagaga 
cctcaggtga 
gtgagccact 
tagactgcaa 
tgggaaggtc 
tctggaggat 
gttgttgagg 
tgtctccaag 
aactctggat 
catggaacac 
ctggagctga 
ggccatactc 
ctcccgggat 
ctcccagttg 
cctcatctga 



ccagggggct 
cttagaatag 
cactgtacta 
ttctggtttg 
agagtaagtt 
cgaaggacag 
ttgtcatcca 
tgtcgcccag 
cgcctcctgg 
gattacaggt 
agatgaggtt 
aatctgcccg 
tatgttttct 
ccctgggatg 
acaggtacct 
acccacagca 
tctgaatccc 
ctttctcccc 
catgtcaggg 
ctcgcttgtt 
gcccagagct 
ctcccagatt 
tataggtgcc 
cagtttcacc 
tccaccctcc 
gtgcctggct 
tcgagtttaa 
atcaaatgaa 
gactttgagc 
gatggggagg 
gaatgcacaa 
acaaggtaca 
agtcatcttt 
gccatggaac 
tctggtagat 
gcctgttgct 
gaaccacaaa 
accactcatg 



gggtgggaag 
cactagttaa 
tgctctttat 
ttatcccaag 
gtccaaggcc 
cctatgatca 
gaactatgtt 
gttggagcgc 
gttcaagtga 
gcccacaacc 
tcaccatgtt 
ctttggcctc 
gataagctac 
ggggaggggc 
cctgaattga 
gccagactta 
tcttgagctg 
aggacaacag 
ttcatactca 
cttttctttt 
ggagtgcagt 
caagcaattc 
agccaccaag 
atgttggcca 
tcagcctccc 
gcttgttctt 
ctacagtcta 
ggctggaggc 
cctacatggt 
gctgaaaaca 
tttatggagg 
aagtactgga 
gtctgcctgg 

at gggaagaa 

aagctttcct 
aggaagtcaa 
ttcctggcat 
ccagggcacc 



cccctcctag 
tgcatacagg 
aaacattaac 
ttttcagata 
acatagctac 
gtgatgcagt 
ttcttttctt 
agtggcgtga 
ttctcctgct_ 
acaactggct 
ggccaggctg 
ccaaaatgct 
gatgcttgga 
agcaaagtcc 
ctttgtccta 
tccccacatg 
cagtgggctg 
agttgaaagt 
agggtttctt 
ttttttttta 
ggcataatct 
tcctgcctca 
cccggctaat 
ggctggtctc 
aaagtgctgg 
ttaagaacca 
tagatactgt 
ttgcttaggt 
ctgtacccca 
gaacgataaa 
gagctcaaac 
tgtccagaaa 
gaggcggctt 
tctgaacttg 
tgcagggtaa 
atttctcttt 
tgcccagagt 
agtgtttctg 
FIGURE 1F
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14951 
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15201 

15251 

15301 

15351 

15401 

15451 

15501 

15551 

15601 

15651 

15701 

15751 

15801 

15851 

15901 

15951 

16001 

16051 

16101 



actgcctgga 
ttacacgcca 
gagggaatca 
aggtacagta 
gacggtgtgg 
ggagaagtaa 
tctcctgttt 

ATCAGGAG<7A 



gtgaggggtt 
ggcggggtgg 
acaaacagtg 
cagatcagga 
ccttggcttg 
ggccaggtgt 
ctttccaoCC 
CGAGGap-p.^ 



ttacagggga 
ttgcgggggt 
aggtgagctg 
gagaggtgag 
ggccaactga 
tggtcctttg 
TSAACAAAGA 



agtgaatgat. 
tggatgttaa 
ggcctggagg 
agctggggca 
gagagaggag 
tccactggct 
gCAGATCGAg 



gaggaggcct 
ctctggtcaa 
gatcaccggg 
tggtgaggaa 
cgggggtaag 
cagccctgca 
TTCCAgCCCA 



GCTCACGCTG GPATPATTGG crr.rTwrv* 



ggCCTBCAfiT CCCATGATPA CCATCCTrrr Afmr^ura» c aaggapp&a 
ACTACTGTGG CCCAAOAGGG AATPppttpt rr ACC ^r rtttrmm 
ACCACAAggC AGCCAAACAG AAPGTTAOP^ g ccaggaaga caapaaggpp 
TggAAGCTTA AGGCTGTGGA cgcpttp a ag Tmrr m c tctatpagap, 
gCCAgggTAC TAgAgTGCCC CACAGACGCP crTr^rrrr a ctpppatgt 
TCTTCCCCCT AGAACrftTrA GCGCCGTPAA AGrrrrir^. tgtpapaggp 
ATA GACA CCA AACACftAAAg CTTAAAGftCT g^h^ ^ tg gp^ppaa^al 
AAGTTTTGAA TTGCTPTPAG AGAappATCn nr,rr m ^ gagpapppak 
AAGTATCTCA AGTGAGGAGG AAAAPtgtgg AGTr^p fl acct gapp^ataty: 
CCAGAGATCC CCGAAAATCA cptpaaagaa ppttty^ aac aatpappaap 
CAACATACAC ACTACAPTCA aagatpa pat ggatppttat tgggppttw: 



AAAAC AGo t. g 
cccaccccag 

agggttccat 

ggggtatata 

ccttttctca 

tgaaggaaga 

agggctgaca 

ttactttgag 

ggatgacaga 

actacaggaa 

ggtgaggttg 

gaacctcacc 

acaaaatcag 

aattataaac 

tcccttttct 

ctattatgat 

gctaagacag 

tcatatttaa 

aggaggtcag 

tctaccaaaa 

ccaacgcagg 

tgcagtgaga 

gtctcaaaaa 

caacattttg 

tactttcatt 

GGGGATGPTT 



tgtcctccac 

cttcccttgc 

cactgccaga 

cttggccacc 

cttcaccctg 

tgaggttgtg 

ggccaggctt 

caagggtggc 

tgaacacttc 

agggtggcag 

agggtgtcca 

aaaatacttc 

atatttccct 

accccacttc 

ggattctcaa 

tgaaacctta 

gaacttggca 

gaatcttgtc 

gagtttgaga 

aaaatacaaa 

aggttgaggg 

ttgagcaact 

aaaaaaaaaa 

gtatttgaaa 

ctcactaaGG 

CGCCAGrrirz 



ctgaaccagg 

tctgagccta 

gcacactgga 

ttcacaggga 

gtatcacccg 

ctgaccagaa 

agctgagcag 

tgacccaaaa 

ccccataact 

gaactgcctc 

gcgcccttag 

ttgcttcctt 

ttattccaga 

agccccaatc 

gcagttactt 

aaagggcaac 

aacatctgtg 

ttgggctggg 

ccaacctggc 

tcagctggcc 

gagaattgct 

gcaatccagc 

aggatcgtct 

tgaaggtacc 

ATGAAGCA^A 



ggcactgcat 
cccttcctcc 
cctacgccca 
tcctagggaa 
gaagacttct 
tgctgctgga 
atgttatcac 
ccatgaggtg 
atttagggta 
actcctagga 
gtcattttct 

ggggtcagcc 

tttcctggac 
acgtgggagg 
tcacgggtca 
aatttcantc 
gcctgttcag 
tgtggaggca 
caacatgatg 
gtcgtggtgt 
tgaacccagg 
ctgggcgacg 
caacctttgc 
ttccatactt 

TTCP T A APPT 



tgccctgtgc 
acaatttcct 
gcactggctt 
gtgttcggga 
tgggaccagg 
gaactgcccc 
tggccccaac 
gcagtcagct 
gtacccaagc 
actggtagat 
cactgcctgg 
caaagctgtc 
actgtcaccc 
aagtgtaact 
gaacacgcag 
ttgcttctag 
caaaggatgt 
agtgaatcac 
aaaccccatc 
gcctgtagtc 
aggtggtggt 
gagtgagact 
cctcctactg 
atgctgttaa 
gCTTCCTA&I 



GTCCTCACCT fyimvyrarar PAGPAGGAPA 



CTGATCCAGT CACAGCCATA cagptgtppa captyiaap^ a cgtgtpptap 
AACAGCCTgA ATCAAAT^^ tagpttaata cata^ aaatc ppagaptact 
TCAGCCTTTA ATGCCTTTTA ttcataaaaa ctgtgaaap-p tagactgaac 
CATTOGAAAC ATTTAAPTPA gaptptt^at Trar.a ^ CG gaapppttag 
TTCTATCTGA ATCCAAGAPA GCCAPAPPTT ACT&T&rn oe ppaaaptaat 
GAGTTTAATA AATACAAATA CTCGT (SEQ.ID.NO.:1)
FIGURE 2

CAGGGAGTCCCACCAiGCCTAGTCGCCAGACCTTCTGTGGGATCATCGGAC 50 
CCACCTGGAACCCCACCTGACCCAAGCCCACCTGCTGCAGCCCACTGCCT 100 
GGCCATGACGATCACTTACACAAGCCAAGTGGCTAATGCCCGCTTAGGCT 150 
CCTTCTCCCGCCTGCTGCTGTGCTGGCGGGGCAGCATCTACAAGCTGCTA 200 

TATGGCGAGTTCTTAATCTTCCTGCTCTGCTACTACATCATCCGCTTTAT 250 

^ATAGGCTGGCCCTC^CGGAAGAACAACAGCTGATGTTrcAGAAACTGA 300 

CTCTGTATTGCGACAGCTACATCCAGCTXTATCCCCATTTCCrTCGTGCTG 350 

GGCTTCTACGTGACX3CTGGTCGTGACCCGCTGGTGGAACCAGTACX3AGAA 400 

CCTGCCGTGGCCCGACCGCCTCATGAGCCTGGTGTCGGGCTTCGTOGAAG 450 

GCAAGGACGAGCAAGGCCGGCTGCTGCGGCGCACGCTCATCCGCTACGCC 500 

AACCTGGGCAACGTGCTCATCCTGCGCAGCGTCAGCACCGCAGTCTACAA 550 

GCGCITCCXXAGCGCCCAGCACCTGGTGCAAGCAGGCTTTATGACTCCGG 600 

CAGAACACAAGCAGTTGGAGAAACTGAGCCTACCACACAACATGTTCTGG 650 

GTGCCCTGGGTGTGGTTTGCCAACCTGTCAATCAAGGCGTGGCTTGGAGG 700 

TCG AATCCGGG ACX^CTATCCTGCTCC AG AGCCTGCTG AACG AG ATG AAC A 750 

CCTTGCGTACTCAGTGTGGACACCTGTATGCCTACGACTGGATTAGTATC 800 

CXTACTGGTGTATACACAGGTGGTCACIGTGGCGGTGTACAGCTIT7TTCCT 850 

GACTTGTCTAGTTGGGCGGCAGTTTCTGAACCCAGCCAAGGCCTACCXrTG 900 

GCC ATG AGCTGG ACCTCGTTGTGC(XGTCTTC ACGTTCCIGCAGTTCTTC 950 

TTCTATGTTGGCTGGCTGAAGGTGGCAGAGCAGCrcATCAACCCCTTTGG 1000 

AGAGGATGATGATGATTTTGAGACX^ACTGGATTGTCGACAGGAATTTGC 1050 

AGGTGTCCCTGTTGGCTGTGGATGAGATGCACCAGGACCTGCCTCGGATG 1100 

GAGCCGGACATGTACTGGAATAAGCCCGAGCCACAGCCCCCCTACACAGC 1150

TGCTTCCGCCCAGTTCCGTCGAGCCTCCTTTATGGGCTCCACXTTCAACA 1200 

TCAGCCTGAACAAAGAGGAGATGGAGTTCCAGCCCAATCAGGAGGACGAG 1250 
GAGGATGCTCACGCTGGCATCATTGGCCGCTTCCTAGGCCTGCAGTCCCA 
TGATCACCATCCTCCCAGGGCAAACTCAAGGACCAAACTACTGTGGCCCA 
AGAGGGAATCCCTTCTCCACGAGGGCCTCCCCAAAAACCACAAGGCAGCC 
AAACAGAACGTTAGGGGCCAGGAAGACAACAAGGCCTGGAAGCTTAAGGC 
TGTGGACGCCTTCAAGTCTGGCCCACTGTATCAGAGGCCAGGCTACTACA 
GTGCCCCACAGACGCCCCTCAGCCCCACTCCCATGTTCTTCCCCCTAGAA 
CCATCAGCGCCGTCAAAGCTTCACAGTGTCACAGGCATAGACACCAAAGA 
CAAj\AGCITAAAGACTGTGAGTTCTGGGG<XAAGAAAAGTTTTGAATTG^ 

TCTCAGAGAGCGATGGGGCCTTGATGGAGCACCCAGAAGTATCTCAAGTG 1700 

AGGAGGAAAACTGTGGAGTTTAACCTGACGGATATGCCAGAGATCCCCGA 1750 

AAATCACCTCAAAGAACCTTTGGAACAATCACCAACCAACATACACACTA 1800

CACTCAAAGATCACATGGATCCTTATTGGGCXnTGGAAAACAGGGATGAA 1850 

GCACATTCCTAACCTGCTTCCTAATGGGGATGCTTCGCCAGCCAGGTCCT 1900 

CACCTGTGTGTACACCAGCAGGACACTGATCCAGTCACAGCCATACAGCT 1950 

GTCCACACTGAAGAACGTGTCCTACAACAGCCTGAATCAAATGGTTAGCT 2000 

TAATAGATAAAAATCCCAGACTACTTCAGCCTTTAATGCCTTTTATTCAT 2050 

AAAAACTGTGAAAGCTAGACTGAACXrATTGGAAACATTTAACTCAGACTC 2100 

TGGATTCAGAGTCGGGAACCCTTAGTTCTATCTGAATCCAAGACAGCCAC 2150 
ACCTTAGTATACTGCCCAAACTAATGAGTTTAATAAATACAAATACTCGT 
TAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ.ID.NO.:2)
FIGURE 3



15C 

_ _ 20C 

mnSS!^^^^ 300 



^SAQ^VQAGFNTIPAEHKqWi^^ ]S 



j?E?5lfI^^ 350 

MV C ««:™v.^ - 40Q 

^Mrrvclr^A^vc^ YYSAP ^ TPLSPTPMFFPLEPSAPSKLHSVTGIDTKDK 500 



jFL Y QRPG YYS APQTPLSPTPMFFPLEPS APSKLH S VTG TIYTTC m 
HLKEPI^QSPTMHTTl.KDHMDPYWAl£NliE^irs ^S^nSS) 
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50 
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CAGGGAGTCCCACCAGCCTAGTCGCCAGAOCTTCTGTGGGATCATCGGAC 
CCACCTGGAACCCCACCTCACCCAAGCCCACCTGCTCCAGCCCACTGCCT 
GGCO^TGACCATCACTTACACAAGCC\AGTGGCTAATGCCXX}CrTAGGCT 
CCTTCTCCCGCCTpCrGCTGTGCTGGCGG GGCAGC ATCTACAAGCTGCTA 
TATGGCGAGTTCTTAATCTrCCTGCTCTGCTACTACATCATCCGCTTTAT 

CTCTGTATTGCGACAGCTACATCCAGCTCATCCCX^ATTTCCTTCGTGCTG 
SSH^^ 400 
CCTGCCGTGGCCCGACCGCCTCATGAGCCTGGTGTCGGGCTTCGTOGAAG 450 
GCAAGGACGAGCAAGGCCGGCTCCTGCGGCGCACGCTCATCCGCTACGCC "~ 
AACCTGGGCAACGTGCTCATCCTGCGCAGCGTCAGCACCGCAGTCTACAA 
GCGCTTCCCCAGCGCCCAGCACCTGGTGCAAGCAGGCTTTATGACTCCGG 
^^S^^^ AG7TOAG ^^ AG ^ A CCACACAACATCTTXr^ 

GTGCCCrGGGTGTGGTTTGCCAACCTGTCAATGAAGGCGTGGCTTGGAGG 
TCG^TOTGGGACCCTATCCTGCTCCAGAGarrc^ 

CCTTTGCGTACTCAGTGTGGACACCTGTATGCCTACGACTGGATTAGTATC 800 
CCACTGGTGTATACACAGGTGGTGACTGTGGCGGTGTACAGCTTCTTCCT 850 
GACITGTCTAGTrGGGCGGCAGrrTCTGAACCCAGCCAAGGCC^^ 900 
GCCATGAGCTGGACCTCGTTCTGCCCGTCTTCACGTTTCTGCAGTTCTTC 
TTCTATGTTCGCTGGCrcAAGGTGGGCCTCTCCAGGGCCCTGCTGGGCTG 
GAGGCATGGCCAGAGGGGTCATGGCCAGCAGCTGCTTGAGACGAGGATGC 
AGTGTCAGGAAAGGAAGGTCTCACGGGTAGAAAGCAGCCAGGCGTGGTGG 
CGCACACCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGCT 
TGAACCCGGGAGGCGGAGGTrGTGGTGGCAGAGCAGCTCATCAACCCCTr 1200 
TGGAGAGGATGATGATGATTTTGAGACCAACTGGATTGTCGACAGGAATT 1250 
TGCAGGTGTCCCTGTTGGCTGTGGATGAGATGCACCAGGACCTGOCTCGG 1300 
A I G ^ AG £H GGACATGTAC ^ AATAAGCCCGA ^ CA CAGCrc 1350 

AGCTGCTTCCGCtXAGTTCCGTCGAGCCTCXnTrATGGGCTCCACCTTCA 1400 
ACATCAGCCTGAACAAAGAGGAGATGGAGTTCCAGCCCAATCAGGAGGAC 
GAGGAGGATGCTCACGCTCGCATCATTGGCCGCTTCCTAGGCCTGCAGTC 
CCATGATCACCATCCTCCCAGGGCAAACTCAAGGACCAAACTACTGTGGC 
CCAAGAGGGAATCCCTTCTCCACGAGGGCCTGCCCAAAAACCACAAGGCA 
GCCAAACAGAACGTTAGGGGCCAGGAAGACAACAAGGCCTGGAAGCTTAA 
GGCTGTGGACGCCTTCAAGTCIXKKTCX^CTGTATCAGAGGCCAGGCTACT 1700 
ACAGTGCCCCACAGACGCCCCTCAGCCCCACTCCCATGTirTTCCCCCTA 1750 
?^5? A T CAG £S^ TCA ^ GC ^ ACA ^^^ G ^ATAGACACCAA 1800 
AGACAAAAGCrTAAAGACTGTGAGTTCTGGGGCCAAGAAAAGTTTTGAAT 
TGCTCTCAGAGAGCGATGGGGCCTTGATGGAGCACCCAGAAGTATCTCAA 
GTGAGGAGGAAAACTGTGGAGTTTAACCTGACGGATATGCCAGAGATCCC 
CGAAAATCACCTCAAAGAACCTTTGGAACAATCACCAACCAACATACACA 
CTACACTCAAAGATCACATGGATCCTTATTGGGCCTTGGAAAACAGGGAT 
GAAGCACATTCCTAACCTtKTTCCTAATGGGGATGCTTCGCCAGCCAGGT 
C^ACCTGTGTGTACACCAGCAGGACACTGATCCAG-TCACAGCCATACA 
GCrGTCCACACTGAAGAACGTGTCCTACAACAGCCTGAATCAAATGGTTA 
GCn'AATAGATAAAAATCCCAGACTACTTCAGCCTTrAATCCCTTTTATT 
CATAAAAACTGTGAAAGCTAGACTGAACCATTGGAAACATTTAACTCAGA 
CTCTGGATrCAGAGTCCKXJAAtXXn^ 

CACACCrrAGTATACTGCCCAAACTAATGAGTITAATAAATACAAATACT 
^^AAAAAAAAAAAAAAAAAAAAAAAAA (SEQ.ID.NO.:4)
FIGURE 5



MTmnrSQ V AN ARLG SFSRLLLCWRGS IYKLL YGEFLIFLLC YYIIRFIY 50 
RLAi;iTEEQQLMFEKLTLYCDSyiQLIPISFVLGFYVn.VVTRVAVNOYENL 

PWPTSBI M5II VSfSFVPnVTkPrWTDT I DDT! IDVAkn r.vnn ti nmH<TMiwr 



100 
150 



PWJmMSLVSGFVEGKDE(XJRIJJWTLmYANLGNVLILRSVSTAVYKR idu 

FPSA^LVQAGFMTPAEHKQl^Kl^U'HNMFWVPWVWFANl^MKAWLGGR 200 

IRDPiLLQSliNEMhniJRTQCGHLYAYDWISIPLVYTQVVTVAVYSFPLT 250 

CLVGRQFLNPAKAYPGHELDLVVPVFTFLQFFFYVGWLKVGLSRAI^WR 300 

HGQRGHGQQLLETRMQCQERKVSRVESSQAWWRTPV1PATREAEAGESLE 350 

PGREXLWWQSSSSTPLERMMM^ 400 
SRTCTGBPSHSPPTQLLPPSSVEPPLWAPPSTSA (SEQ.ID.NO.:5) 
FIGURE 7



GenBank/SwissProt
accession number



Protein sequence



SEQ.ID.NO.



CG1CE_ protein 

af016687 (PID:g2315833)

z73105 (PID:e242363)

z73422 (PID:e244423)

z73422 (PID:e244542)
P34577 
P34672 
p34319 
Z68335 
Z68753 



IPISFVLGFX VTLWTRffWN QYENLFWPDR 2 (part) 



(PID:e217363) 
(PID:e218704) 
af 025458 ( PID: €2429439 ) 
U28412 (PID:g849242) 
(PID:gl572760) 
(PID:e351507) 



U70848 
281074 
q09379 
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IPINFMLGFt 
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IPLTFLLGPX 

IPLTFLLGFX 

IPLSFLLGFZ. 

VPMQFMLGYt 



VTIIVGRHND 

VTirVRROND 

VTIWDRHTK 

VTTWNRffTK 

IAGVLRR£WY 

VTAWNRttTY 

CNIIIRRfiLK 

VTTVINRHMT 

VSFWAR|£GS 

VSIVYNRJITK 

VTTVFERJ£RS 

VSNWSRiJWR 

VSNWARKWR 

VAMIVRRKWD 

VSLIVARHWE 

IGMVGERUGE 



IFLNIGWVDN 28 
IFANLGWVEN 29 
LWRTVGFIDD 30
LYQTIGFIDN 31
LYDIIGFIDN 32
LYQIIGFIDN 33
LYTSLGNIDN 34
QFANLGMIDN 35
ILNGIGWIDD 36
VFDNVGWIDT 37
ALNVMPFIES 38
QFETLRWPED 39
QFETLYWPED 40
CCQLISWPDH 41
QFNCISWPDK 42 
SFENVSYIEK 43 
1 GTGCCAAGCCATGACTATCACCTACAGAAACAAAGTAGC 60

1 MTITYTNKVANARLGSF 17

61 CTCGTCCCTCCTCCTGTGCTGGCGAGGCAGCATCTA 120

18 SSLLLCWRGSIYKLLYGEFL 37

121 TGTCTTCATATTCCTCTACTATTCCATCCGTGGACTCTACAGAATGGTTC 180 

38 VFIFLYYSIRGLYRMVLSSD 57 

• 

181 TCAGCAGCTGTTGTTTGAGAAGCTGGCTCTGTACTGCGACAGCTACATTCA 240 

58 QQLLFEKLALYCDSYIQLIP 77

• • • • • • 

241 TATATCCTTCGTTCTGGGTTTCTATGTTA 300

78 ISFVLGFYVTLVVSRWWSQY 97 

301 CGAGAACTTGCCGTGGCCCGACCGCCTCATGATCCAGGTGTCTAGCTTCG 360 



98 ENLPWPDRLMIQVSSFVEGK 117

361 GGATGAGGAAGGCCGTTTGCTGC 420

118 DEEGRLLRRTLIRYAILGQV 137 

421 GCTCATCCTGCGCAGCATCAGC^CCTCGGTCTACAAGCGCTTTC 480 

138 LILRSISTSVYKRFPTLHHL 157 

481 GGTGCTAGCAGGTTTTATGACCCATGGGGAACATAAGCAGTTGCAGAAGTTGGGCCTACC 540 

158 VLAGFMTHGEHKQLQKLGLP 177 

541 ACACAAC^CATTCTGGGTGCCCTGGGTGTGGTTTGCCAA 600

178 HNTFWVPWVWFANLSMKAYL 197 

601 TGGAGGTCGAATCCGGGACACCGTCCTGCTCCAGAGCCTGATGAATGAGGTGTGTACT^ 660 

198 GGRIRDTVLLQSLMNEVCTL 217 

# * 

661 GCGTACTCAGTGTGGACAGCTGTATGCCTACGACTGGATAAGTATCCCATTGGTGTACAC 720 

218 RTQCGQLYAYDWISIPLVYT 237 

721 ACAGGTGGTGACAGTGGCAGTATACAGCTTTTTCCTTGCATGCT^ 780 
238 QVVTVAVYSFFLACLIGRQF 
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• • ♦ * » * 

781 TCTGAACCCAAACAAGGACTACCCAGGCCATGAGATGGATCTGG 840 
258 LNPNKDYPGHEMDLVVPVFT 277



841 AATCCTGCAATTCTTATTCTACA^ 900 
278 ILQFLFYMGWLKVAEQLINP 297 



901 CTTCGGGGAGGACGATGATGATTTTGAGACTAACTGGA 960 
298 FGEDDDDFETNWIIDRNLQV 317



961 GTCCCTGTTGTCCGTGGATGGGATGCACCAGAACTTGCCTC 1020

318 SLLSVDGMHQNLPPMERDMY 337 



1021 CTGGAACGAGGCAGCGCCTCAGCCGCCCTACACAGCTGCTTCTGCCAGGTCTC 1080 
338 WNEAAPQPPYTAASARSRRH 357 



1081 TTCCTTCATGGGCTCCACCTTCAACATCAGCCTAAAGAAAGAAGACTT 1140 
358 SFMGSTFNISLKKEDLELWS 377 



■ • * • • • 

1141 AAAAGAGGAGGCTGACACGGATAAGAAAGAGAGTGGCTATAGCAGCACCATAGGCTGCTT 1200 
378 KEEADTDKKESGYSSTIGCF 397



* • • • • • 

1201 CTTAGGACTGCAACCCAAAAACTACCATCTTCCCTTGAAAG 1260 
398 LGLQPKNYHLPLKDLKTKLL 417 



• • • • • • 

1261 GTGTTCTAAGAACCCCCTCCTCGAAGGCCAGTGTAAGGATGCCAACCAGAAAAACCAGAA 1320
418 CSKNPLLEGQCKDANQKNQK 437 



• • • • * * 

1321 AGATGTCTGGAAATTTAAGGGTCTGGACTTCTTGAAATGTC 1380 
438 DVWKFKGLDFLKCVPRFKRR 457



1381 AGGCTCCCA'^TGGCCCAC^GGCACCCAGCAGCC^CCCT 1440 
458 GSHCGPQAPSSHPTEQSAFS 477



1441 CAGTTCAGAOICAGGTGATGGGCCTTC^ 1500 
478 SSDTGDGPSTDYQEICHMKK 497 



1501 GAAAACTGTGGAGTTTAACTTGAACATTCCAGAGAGCCCCA 1560 
498 KTVEFNLNIPESPTEHLQQR 517 



1561 CCGTTTGGACCAGATGTCAACCAATATACAGGCTC 1620
518 RLDQMSTNIQALMKEHAESY 537

• * • • • • 

1621 TCCCTACAGGGATGAAGCTGGCACCAAACCTGTTCTCTATGAGTGATGCCTCACAGCCTG 1680 
538 PYRDEAGTKPVLYE 551

• ••••• 

1681 GCCCTGACTTGCAAGGATGCCCAGCAGG^ 1740 

. • • • • • 

1741 ACACCCAGGAGTGTGTTCCCACGACAGTCTAGCATGTAACTCAGAACCAAGAGT^ 1800

• • • • • • 

1801 TAGTCCTGCCTGAAAACACCTGTATTTTACGATCTTTCCC^ 1860 

. • • • * 

1861 CGTGAATATTCTTTTAGGTGAAAAAAAAAAAAAAAAAAAAAA^ 1916 
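
As an illustrative aside that is not part of the original figure, the amino acid row printed under each nucleotide row can be spot-checked by translating the nucleotide row with the standard genetic code. The sketch below assumes that the row numbered 481 to 540 is transcribed correctly, and that its reading frame begins at the second base of the row because the ATG start codon appears at position 11 of the first row as printed. The names `translate` and `offset` are hypothetical helpers introduced only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate complete codons of `dna`, starting at 0-based index `offset`.

    Codons containing unrecognised characters are reported as 'X';
    a trailing partial codon is ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Nucleotides 481-540 as printed in the figure (assumed to be OCR-clean).
fragment = "GGTGCTAGCAGGTTTTATGACCCATGGGGAACATAAGCAGTTGCAGAAGTTGGGCCTACC"

# The ATG is at position 11, so codons start at positions 11, 14, ..., 482, ...
# Position 482 is the second base of this row, hence offset=1.
peptide = translate(fragment, offset=1)
print(peptide)
```

The printed peptide can be compared with residues 158 to 176 of the translation row; the codon for residue 177 straddles the row boundary, so it is not produced from this row alone.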
!!AA_MULTIPLE_ALIGNMENT 1.0
PileUp of: *]

Symbol comparison table: GenRunData:blosum62.cmp CompCheck: 6430

GapWeight: 12
GapLengthWeight: 4

pileup.msf MSF: 596 Type: P October 1, 1998 10:43 Check: 124

Name: Human Len: 596 Check: 3272 Weight: 1.00

Name: MouseBestrophin] Len: 596 Check: 6852 Weight: 1.00



1 50 
Human MTITYTSQVA NARLGSFSRL LLCWRGSIYK LLYGEFLIFL LCYYIIRFIY 
MouseBestrophin] MTITYTNKVA NARLGSFSSL LLCWRGSIYK LLYGEFLVFI FLYYSIRGLY 

51 100 
Human RLALTEEQQL MFEKLTLYCD SYIQLIPISF VLGFYVTLVV TRWWNQYENL
MouseBestrophin] RMVLSSDQQL LFEKLALYCD SYIQLIPISF VLGFYVTLVV SRWWSQYENL

101 * 150 

Human PWPDRLMSLV SGFVEGKDEQ GRLLRRTLIR YANLGNVLIL RSVSTAVYKR
MouseBestrophin] PWPDRLMIQV SSFVEGKDEE GRLLRRTLIR YAILGQVLIL RSISTSVYKR

151 200 
Human FPSAQHLVQA GFMTPAEHKQ LEKLSLPHNM FWVPWVWFAN LSMKAWLGGR
MouseBestrophin] FPTLHHLVLA GFMTHGEHKQ LQKLGLPHNT FWVPWVWFAN LSMKAYLGGR 

201 250 
Human IRDPILLQSL LNEMNTLRTQ CGHLYAYDWI SIPLVYTQVV TVAVYSFFLT
MouseBestrophin] IRDTVLLQSL MNEVCTLRTQ CGQLYAYDWI SIPLVYTQVV TVAVYSFFLA

251 300 
Human CLVGRQFLNP AKAYPGHELD LVVPVFTFLQ FFFYVGWLKV AEQLINPFGE
MouseBestrophin] CLIGRQFLNP NKDYPGHEMD LVVPVFTILQ FLFYMGWLKV AEQLINPFGE

301 350 
Human DDDDFETNWI VDRNLQVSLL AVDEMHQDLP RMEPDMYWNK PEPQPPYTAA
MouseBestrophin] DDDDFETNWI IDRNLQVSLL SVDGMHQNLP PMERDMYWNE AAPQPPYTAA

351 400 

Human SAQFRRASFM GSTFNISLNK EEMEFQPNQE DEEDAH AGIIGRFLGL 

MouseBestrophin] SARSRRHSFM GSTFNISLKK EDLELWSKEE ADTDKKESGY SSTIGCFLGL 

401 450 
Human QSHDHHPPRA NSRTKLLWPK RESLLHEGLP KNHKAAKQNV RGQEDNKAWK 
MouseBestrophin] QPKNYHLPLK DLKTKLLCSK NPLL. .EGQC KD ANQ KNQKD. .VWK 



451 500
Human LKAVDAFKSA PLYQRPGYYS APQTPLSPTP MFFPLEPSAP SKLHSVTGID
MouseBestrophin] FKGLDFLKCV PRFKRRGSHC GPQAPSS HPTEQSAP SS . . SDTG . .
501 550
Human TKDKSLKTVS SGAKKSFELL SESDGALMEH PEVSQVRRKT VEFNLTDMPE
MouseBestrophin] DGPSTDY QEICHMKKKT VEFNL.NIPE



551 596 

Human IPENHLKE.P LEQSPTNIHT TLKDHMDPYW ALENRDEAHS

MouseBestrophin] SPTEHLQQRR LDQMSTNIQA LMKEHAESY. . . PYRDEAGT KPVLYE
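
As a further illustrative aside, not part of the original alignment output, the degree of conservation in a pair of aligned rows can be summarised by counting identical columns. The sketch below uses only the first block of the alignment (positions 1 to 50) as printed above, and it assumes that `.` or `-` marks a gap. The function name `aligned_identity` is a hypothetical helper introduced for this example and is not a PileUp feature.

```python
def aligned_identity(row_a, row_b, gap_chars=".-"):
    """Return (matches, compared) for two aligned rows.

    Spaces used to group residues in blocks of ten are ignored.
    Columns where either row has a gap character are skipped.
    """
    a = row_a.replace(" ", "")
    b = row_b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x in gap_chars or y in gap_chars:
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    return matches, compared


# First alignment block (positions 1-50) as printed in the figure.
human = "MTITYTSQVA NARLGSFSRL LLCWRGSIYK LLYGEFLIFL LCYYIIRFIY"
mouse = "MTITYTNKVA NARLGSFSSL LLCWRGSIYK LLYGEFLVFI FLYYSIRGLY"

matches, compared = aligned_identity(human, mouse)
print(f"{matches}/{compared} identical ({100 * matches / compared:.0f}%)")
```

The same function could be applied to the other blocks, provided the gap columns lost in reproduction are restored first so that both rows have equal length.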



